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A characteristic feature of the state-of-the-art of real-space methods in electronic structure calculations is 
the diversity of the techniques used in the discretization of the relevant partial differential equations. In 
this context, the main approaches include finite-difference methods, various types of finite-elements and 
wavelets. This paper reports on the results of several code development projects that approach problems 
related to the electronic structure using these three different discretization methods. We review the ideas 
behind these methods, give examples of their applications, and discuss their similarities and differences. 
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1 Introduction 

In this paper, numerical methods for the solution of the Kohn-Sham equations Q] of the density-functional 
theory (DFT) (2l are discussed'. We are slowly moving our emphasis from the ground state DFT towards 
its time-dependent (TD) extension TDDFT |3 1. Much of the discussion of the present paper is relevant in 
that case as well. 

In solid-state physics and quantum chemistry the standard discretization methods are today the plane- 
wave methodology and the linear combination (LCAO) of atomic orbitals. It has been recognized that 
despite their merits, they both have certain shortcomings which motivate the development of new methods. 
The plane-wave basis gives accurate results with one single convergence parameter, the cutoff energy. 
On the other hand, it gives a uniform resolution across the entire calculation volume and application of 
local refinements at the core regions of atoms do not very naturally fit into this formalism^. All-electron 
calculations are inpractical, but pseudopotentials and the projector augmented wave-method (PAW) (9] 
[Tol have been developed to circumvent this problem. The domain decomposition method for massive 
parallelization requkes a description in terms of local quantities. In the large matrix inversion step in the 
Green's function approach (Sec. lS.St one benefits from the sparsity of the matrix, which is a consequence 
of the use of a local basis set. Also in other contexts, such as the linear scaling method with localized 
support functions^, the use of a local basis set or description on a grid seems to be a very natural choice. 
The LCAO basis is a local basis and typically tailored to the system so that big systems can be calculated 
with a small number of basis functions. On the other hand, with the atom-centered basis functions it may 
be difficult to systematically increase the size of the basis set towards the so-called basis set limit. Extra 
care needs to be taken to choose special basis sets for the description of excited states in TDDFT. The 
description of dynamical phenomena beyond linear response such as the ionization of atoms or molecules 
under strong laser pulses may be problematic. 

We present an introduction to three systematic real-space methods: the finite-difference (FD) method, 
the finite-element (FE) method and the wavelet method. The last two use local basis sets and the FD 
method is based on a discretization of the differential operator (or sometimes of the entire differential 
equation) that involves only local information. Thus these discretization methods are well suited for models 
where locality plays a crucial role. They also are systematic in the sense that they have relatively simple 
convergence parameters. They all are also generic, which means that they are not tailor-made for a specific 
problem. Besides of the similarities, all the methods have their own characteristic properties, which make 
them better under some circumstances and worse under others. 

The ease of implementation of a particular method naturally depends on the background of the devel- 
oper, as well as on the availability of software hbraries that can be reused during the process. One would 
think that the finite difference method is the easiest approach, because it does not depend on basis func- 
tions. Handling with basis functions typically needs more work, at least in the first place. However, the 
comparison is not so clear. In the case of the FE-method, there exist general purpose open-source pro- 
gram packages f 14 "151, which include tools for the mesh generation, construction of the matrices and for 
solving the resulting linear systems of equations and eigenvalue problems so that the programmer does not 
need to start everything from scratch. An example of the utilization of such packages is given in Sec. 15.41 
Furthermore, the state-of-the-art pseudopotential FD-method in fact involves nowadays also the use of an 
interpolation basis, which is required in the double-grid treatment of the pseudopotential operator 1 16J. 



As the emphasis is on the numerical methods, it has to be pointed out that the discussion is in fact more general and is equally 
well applicable to e.g. Hai'ti'ee-Fock (HF) equations or other formulations of computational quantum mechanics. One of our examples 
is in fact a HF calculation of atomic orbitals (see Sec. 15. 61 . 

However, the method of adaptive coordinates, now popular in finite-difference (FD) |^4j^ and finite element (FE) \^ methods, 
was originally introduced in the plane- wave context I7II8I . 

^ The terminology for these localized functions varies in the litterature: Hernandez et al. fl 1' used the term of support function, 
whereas Skylaris et al. 1 12 1 call them nonorthogonal generalized Wannier functions (NGWF's). The approach of Fattebert et al. 1131 
suggest the term optimized localized orbital (OLO). 
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The generality of a numerical method is also connected with the numerical eiTor control. In the ultimate 
fully adaptive and systematic method the user can predefine a level of numerical accuracy for the physical 
observables. The details of the calculation can then be adjusted such that the desired accuracy is reached. 
The requirement of systematic convergence is easily available in the case of all systematic methods with 
uniform accuracy as the accuracy is controlled by a single parameter in such methods. For non-uniform 
basis sets the convergence parameters vary in space and naturally are more complicated. 

Even if the convergence analysis with a non-uniform discretization is more complicated, spatial adaptiv- 
ity clearly is a desirable feature. Many molecular systems include much empty space, where it is not useful 
to invest so much for accuracy. On the other hand, close to the nuclei a greater resolution is needed, even 
when the Projector Augmented Wave (PAW)-method |9 10 1 or pseudopotentials are applied. The system 
may also consist of a number of different atoms, each of which require a different resolution in its core 
region - when uniform accuracy is demanded, the most difficult atom defines the global resolution. Al- 
though it may often be wise to rely on pseudopotential or frozed core methods in practical calculations, it is 
a desirable feature of the numerical method to deliver also all-electron results when requested. Obviously, 
this is unfeasible for methods with uniform spatial resolution. 

When the computer resources increase the good parallelizability of the method becomes important. The 
local nature of each of the three discretization methods allows for the utilization of domain decomposition 
methods. Another important point in the context of large systems is the compatibility of the method to the 
linear scaling formalisms. Popular varieties of these 0{N) methods involve localized support functions 
which are most naturally expanded in terms of localized basis functions II17I . or presented on a grid which 
spans only a part of the large system |'131. However, it is also possible to use a plane-wave description 
within these localized boxes 1 121 . 

Discretization of the differential equations is only a part of the numerical work, solution of large linear 
systems of equations and eigenvalue problems is not a trivial task. For large systems the solution of the 
eigenvalue problem becomes the dominant part of the calculation in the traditional approach, where the 
global orbitals from the KS-equations are directly solved, scaling as the cube of the number of atoms. 
Each of the three discretization methods lead to sparse matrix problems. Many methods in linear algebra 
are generic in the sense that they do not depend on the discretization, but others may do so. For example, 
the multi-grid methods require a hierarchy of discretizations of different resolutions and restriction and 
extension operators between them which are trivial to construct in the FD and wavelet approaches, but 
slightly more challenging in the FE-context. 

Ab initio molecular dynamics, either in its Car-Parrinello 1181 or Born-Oppenheimer 1191 variety, is a 
very important area of applications in our field. Every general purpose program package has to be capa- 
ble of performing such calculations, and when a discretization technique is introduced, it is important to 
address the question of its feasibility. All three discretization methods discussed in this paper are in prin- 
ciple compatible with ab initio molecular dynamics. With FE-methods in adaptive coordinates, pioneering 
calculations have been recently presented by Tsuchida |20|. As we discuss in Sec. 16.21 such calculations 
can also be performed with a uniform FE-mesh or with the general unstructured tetrahedral mesh, an ap- 
proach to local refinements which we recommend instead of the method of adaptive coordinates. Also the 
finite-difference method has been shown feasible in the context of Car-Parrinello type molecular dynamics 

L2l|. 

We have opted for a structure that follows the idea of separation between the definition of the model 
of the physical system, typically as a set of coupled integrodifferential equations, discretization of the 
continuous equations, and solution of the discrete equations. In this paper we address the last two of these 
topics. The detailed structure of the paper is as follows: In Sec. 2. we introduce the discretization methods. 
In Sec. 3. we discuss methods of linear algebra for linear systems of equations and for eigenvalue problems. 
In Sec. 4. we introduce briefly the three lines of work in which the authors of this paper have been involved 
in, and the six code development projects that are associated with these. In Sec. 5. we present some 
calculation examples generated by these projects. In Sec. 6. we discuss our future development plans and 
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open questions. In Sec. 7. we discuss the similarities and differences of the discretization methods, and 
finally in Sec. |8]we summarize. 



2 Discretization methods 

In most of the calculations discussed in this paper the main computational task is to solve numerically a 
single-particle Schrodinger equation of the form 

F^, = e,V«, i = l,...,N (1) 

where H is a Hamiltonian operator ipi and are the single-particle orbitals and eigenvalues, respectively. 
In the case of electron transport problems (Sec. lS.St . one solves instead for the retarded Green's functions 
G" from the following equation, which has to be solved repeatedly, for multiple values of positions r' and 
electron energies uj: 

(u;-H^G''ir,r';uj)^S{r^r'). (2) 

In addition, the Poisson equation 

V^Vcir) = -47rp(r) (3) 

must be solved to obtain from the total charge density p{r) the electrostatic potential Vc(r) which enters 
the Hamiltonian H in Eq.^or Eq.|2] The Hamiltonian H in the above equations may originate from the 
density-functional theory or Hartree-Fock theory. In both cases, the Hamiltonian depends on the solutions 
of the equations, which makes the problem nonlinear. 

For clarity, we define an example Hamiltonian H from density-functional theory in the presence of 
Kleinman-Bylander (KB) 1221 form of pseudopotentials by its action on a test function 7](r): 

iivir) ^ -^V'vir) + VMirWr) +J2''r.Cm{r -Ra) J Cmi^' ~ RaHr')dr' . (4) 

aim 

The three terms of the Hamiltonian are referred to as the kinetic energy operator, effective potential energy 
operator* Vcs [n] (r) and the pseudopotential operator defined by the help of atom centered (nucleus at Ra) 
angular momentum {I, m) dependent projectors (r — Ra) and the pseudopotential operator^. 

The numerical solution consists of a discrete presentation of the infinite-dimensional problem and then 
solution of the resulting discrete problem (this solution step is the topic of Sec. |3}. In this section we 
present three discretization methods, all of which are often referred to as systematic real-space methods. 
In Sec.O we introduce the finite-difference method. The other two methods are both variational methods, 
thus we discuss them first on a common footing in Sec. 12.21 Thereafter we discuss the finite-element 
method in Sec. 12.31 and the wavelet approach in Sec. 14.61 Other systematic real-space methods exist as 
well, for example, the discrete variable representation (DVR) method 1 23 1 and the Lagrange mesh method 
|24|. For historical perspective, we mention in this context also the highly accurate methods for diatomic 
molecules by Pyykko et al. |25| and Becke et al.|26|, as well as the method for polyatomic molecules by 
Becke et al. 1271 and for solids by Springer 1 28 1. 



The notation emphasizes the functional dependence of the effective potential on the density, which on the other hand is de- 
termined from the eigenfunctions of H. The detemination of the self-consistent density is therefore a nonlinear problem, which is 
further discussed in the preamble of Sec.|3] 

^ For convenience of notation, we hereafter enumerate these projectors with a single index (such an enumeration obviously exists) 
n = n(a, I, m), and use projectors with shifted origin S,n{a,l.m){^) = ?aim('' ~ R-a)- 
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2.1 Finite difference methods 

The finite difference method is a popular numerical approach because of its conceptual simplicity. No basis 
functions are involved, which makes the implementation of a computer program very easy. A uniform grid 
is utilized, just as the Fourier grid which appears in fast Fourier transform routines. The method is thus 
ideal, if one also wants to reserve the opportunity to evaluate some operators in /c-space, as e.g. in the 
algorithm described in Sec. 13. 2. 41 It is also straightforward to obtain the hierarchy of coarser grids required 
in multilevel (multigrid) methods, which are often the best available preconditioning methods for iterative 
solvers in linear algebra. 

The central idea of the simplest version of finite difference (FD) method is a discrete representation of 
the partial differential operators. In the case of our example Hamiltonian (Eq.|3 the emphasis is thus on 
the kinetic energy operator In some FD-approaches the entire differential equation is discretized and not 
only the differential operators 1 29l I30l l3Tl . For clarity of presentation, we discuss here only the simpler 
variety, which is also more widely used in electronic structure calculations. 

In the discretization procedure the function under scrutiny is sampled in an evenly spaced point grid in 
three dimensions. This provides the necessary mapping from the function 77(r) in the infinite-dimensional 
Hilbert space of the model to a vector e in a finite-dimensional space. The second derivative at a grid point 
is approximated by a weighted sum of the values of the function in neighboring grid points 



JV 



(5) 



where {xi,yj,Zk) = {ihx,jhy,khz) with the grid spacings h^^y^z- The coefficients c„ are derived by 
requiring this formula to be accurate for the {2N)t\\ order polynomial fitted to the values at the 2N + 1 
sampling points above. The values of these coefficients up to = 6 can be found from literature, for 
example, from Ref. |32|. Note that the polynomial is different when the recipe is applied at neighboring 
points. Thus the FD-scheme does not implicitly define an interpolating polynomial as an element of the 
original Hilbert space. Nevertheless this treatment of the kinetic energy operator is quite accurate for 
smooth functions. 

The projector functions ^„ in the pseudopotential operator are also represented by their values Xnijk 
at the grid points in the simplest implementation of the FD-approach. The integrals occurring in the 
pseudopotential operator are approximated by the trapetsoid rule 

(V'"'77)(ryfc) ^C„^„(ry7c) / in{v')ll(v')dv' W ^^CnXnijk ^ Xni'j'k'Gi'j'k' ■ (6) 
ri n i'j'k' 

It has been recently widely recognized, that this sampling of the pseudopotential projectors often re- 
quires very fine grids to be sufficiently accurate^. When too coarse grids are employed, one encounters 
problems such as the egg box ejfect ['3 3). This is the spurious dependence of the total energy as a function 
of the atom position when the atom moves from one grid point to another, and is mainly a consequence of 
the poor sampling of pseudopotential projectors. 

In order to improve the discretization accuracy of the pseudopotential operator, Ono and Hirose (OH) 
II6I proposed to use polynomial interpolation to obtain values on a finer grid from the discrete values e, 
and use trapetsoid rule on this finer grid^. After a straightforward manipulation they find that the discrete 



Even more problematic formulas occur for the forces, as they involve the derivatives of the projectors that vary more rapidly 
than the projectors themselves. 

^ Note that the interpolating polynomial is necessarily different from the polynomial used in obtaining the second derivative in 
Eq|5] although a polynomial of the same order can be used. The interpolation procedure involves the association of a piecewise 
polynomial basis function of product form to each grid point. 
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version of the pseudopotential operator can be evaluated on the coarse grid as 

(y"'r?)(rijfe) f^^cixiijk ^ xii'j'k'ei'j'k'- (7) 

I i'j'k' 

Here the discrete values for the projectors are given by the fine grid trapetsoid rule approximation of 

xiiik = j ^i{r)(t)ijk{r)dr. (8) 

Above the (f)ijk (r) is a piecewise polynomial basis function from Lagrange interpolation, with value of 
unity at the point (i, j, k) and zero in other points. It is a cartesian product of three piecewise polynomials 
in one dimension. Although the matrix elements of the discrete pseudopotential operator need not be 
stored, it is instructive to write the formula 



V^k,i'j'k' ='^cixiijkxii'j'k' = / ^yfc(r)l/"Vi'j'fc'(r)(^r, 
I '' 



(9) 



where the required integrals are approximated by a trapetsoid rule on a fine uniform grid. 

The local potential energy operator is diagonal - sampled by its values at the grid points - in all practical 
implementations of the FD-approach. Therefore, in practical implementations where the OH-scheme is 
used, it is beneficial to exploit the freedom of choice within the pseudopotential approach by picking a 
very smooth local part for the pseudopotential. 

2.2 Variational methods with basis function discretization 

Variational methods are those discretization methods which involve a set of basis functions. In this paper 
we present two methods which use them - the finite-element method and the wavelet method. The main 
idea in variational methods for solving partial differential equations is to multiply the equation by a test 
function and then integrate the identity over the computational domain. For instance, this leads us to the 
eigenvalue problem 

aiy(V'>»?) = j 'n{Y)H'il){Y)dY = e j tp{r)r]{r)dr = £{ilj,r]) (10) 

where r] is our test function, and aw{-, •) is the bilinear form for our wavelet method, and the last equality 
sign defines the inner product (•, •). In the finite-element method implementations in this paper, an inte- 
gration by parts of the laplacian is perfomed to obtain a symmetric bilinear form, and to treat the boundary 
terms in a natural way. Thus, with our example Hamiltonian the biUnear form for the finite-element method 
would read 

aFE{i^,v)= J I^^VV(r) • Vj?(r) + V-(r)V(r)ry(r) + g c^^W J ^i{r')i;{r')dr'rj{r)^ dr 
+ boundary terms. 

(11) 

and the variational eigenproblem would be 

aFE{^,v) = £i'^,v) (12) 
for all test functions r]. 
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The discretization in variational methods is obtained simply by choosing finite-dimensional spaces Vh 
and Th to approximate the functions tp and 77 in the above eigenproblems. Then one obtains with trial 
functions {V'jljLi^ V'i G and with test functions {rij}"^^, r/j e T\ the Petrov-Galerkin condition 

aw/FE{-4'h,Vk) ^ £h{il^h,Vk), k^l,...,n (13) 

where 

n 

'^h = ^CjiPj. (14) 

This leads to a finite-dimensional generalized eigenvalue problem 

He = EhSc (15) 

where H is the discretized hamiltonian, Hij = aw/FE{''Pj, Vi)^ S is the overlap matrix, Sij — {ipj, rji), 
and c is the vector of the unknown coefficients Cj. 

In variational methods it important to consider the choice of the finite-dimensional subspaces Vh and 
Th- This is the point where the finite-element method and the wavelet method we consider diverge from 
each other. In particular, in the finite-element method we follow the usual convention and select Vh = Th 
whereas our implementation of the wavelet method uses different spaces for the trial and for the test 
functions. 



2.3 Finite-element method 

The finite-element method (FEM) 1341 l35l 1361 is widely used in many different fields, for example, in 
structural mechanics, fluid dynamics, electromagnetics and heat transfer calculations. The popularity of the 
FEM comes from it's flexibility. It allows different geometries and boundary conditions to be implemented 
in a straightforward way. Possibility to use refinements and higher order polynomials in the basis reduce 
the number of the basis functions in comparison for example to the number of grid points in the finite 
difference approach. Because of the popularity of the method there is a lot of theoretical work and useful 
tools available. A recent review of the state of the art for FEM in electronic structure calculations can be 
found in Ref. |37|. 

In the FEM the calculation domain 51 is divided into small regions called elements. In three-dimensional 
calculations tetrahedral, hexahedral and pyramid-shaped elements are used. Pyramids are needed to com- 
bine the meshes consisting of hexahedra and tetrahedra together. Hexahedral (box-shaped) elements are 
good if the calculation volume is needed to be filled uniformly. Instead tetrahedra have the advantage that 
their size can easily vary inside the domain region for example to increase the accuracy in the core regions 
of the atoms. Generating high-quality meshes is a nontrivial task, and until recently reliable, free and 
user friendly three-dimensional mesh generators have not been easily available. The situation is changing 
rapidly however, as high quality open-source mesh generators have become available |38 14 1. 

The basis functions are constructed conforming to the mesh of the elements so that they are non-zero 
only in a few neighboring elements. This ensures the local nature of the basis functions and the resulting 
matrices become sparse. The utilization of local basis functions also enables the parallelization of the 
problem based on domain decomposition methods. Small elements imply many basis functions which 
yields a good numerical accuracy. This is how we can increase the accuracy in the regions where the 
solution changes fast. This is particularly useful in many atomistic calculations involving hard norm- 
conserving pseudopotentials because it is easy to increase the accuracy near the core regions of the atoms. 
Many systems also contain a lot of empty space where large elements can be used. 
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2.3.1 Finite-element p-basis 

There are some options on how to choose a good finite-element basis II34I . The simplest choice is to use 
the linear elements so that the basis function is unity in one of the nodes and declines linearly to zero 
towards the boundaries of the element. The linear basis is easiest to implement, but in order to achieve a 
better convergence high-order elements are used. For a smooth solution there is a remarkable difference 
in the accuracy between the linear and higher-order element calculations with the same number of basis 
functions. 

The p-elements are hierarchical in the sense that the higher-order basis set includes also the lower-order 
basis sets |39|. A basis set has four types of functions, node- edge-, face-, and element-based functions. 
The node-based functions, which are linear, are nonzero only in the volume of the elements which have a 
common node. Similarly an edge based function is nonzero in the elements which have a common edge. 
The element-based functions have their support only inside one element. 

The basis function set of the p-elements is derived using the Legendre polynomials. This ensures that 
their derivatives are more orthogonal to each other than in the traditional case of nodal basis functions. 
The orthogonality makes the solutions numerically stable even when using polynomials of high order. 
Otherwise the conditioning of linear systems can be a problem. The one-dimensional basis functions in 
the reference element are shown in Figure^ 




Fig. 1 One-dimensional basis functions in the reference element [-1,1] up to the fourth order. Inside a element there 
ar two node-functions and three element-functions 

The high order element includes more basis functions than the low order one. Where one tetrahedral 
linear element includes four linear basis functions, the fourth order one has 35. This means that the size of 
the elements is bigger and the overlap between basis functions is larger than in a linear mesh. Therefore 
when the number of the basis functions is reduced, but at the same time the filling of the coefficient matrix 
is increased. The filling of the matrices increases the CPU time and memory requirements when solving the 
equations, while the reduction of the number of basis function works in the opposite direction. However, 
a smooth solution converges faster for a larger filling. Therefore the critical factor for a good convergence 
with high-order polynomials is the smoothness of the solution. Naturally, for each problem there is an 
optimal polynomial order. Because the hierarchical nature of p-elements it is possible to change the order 
of the elements inside one calculation mesh. This results in the so-called hp methods, where the polynomial 
order p is variable in space as well as the size h of the elements. This is useful if the nature of the solution 
has rapid changes in some part of the calculation area, but otherwise the solution is smooth. This is the 
case for example in the DFT calculations in the atomic core region. 
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In adaptive methods, an interesting question is how to choose whether to increase the order or decrease 
the element size in a given part of the mesh where resolution needs to be enhanced. 

In the case of all-electron calculations, where the effective potential has a singlarity of the type 1/r, 
when the same mesh is utilized for the interpolation of the effective potential and the orbitals, the standard 
results of approximation theory 1 36 1 suggest a refinement strategy where small elements with low order 
p are used in the vicinity of the nuclei. In other parts of the space, the orbitals as well as the effective 
potential are smooth, which suggests larger and higher order elements. 

Multilevel methods can be also of the "multi-p"-variety, where the same mesh is used at each resolution 
level, but the order of the elements varies. 

2.4 Wavelets 

The variational approximation scheme using the wavelet basis as the trial and test functions will combine 
the best aspects both of the FEM approach and Fourier approximation. From the FEM world we retain the 
locality of the approximation. For a wide class of operators their matrix representation H in the wavelet 
basis is nearly diagonal. This fact together with the refinement equation makes it possible to carry out 
the matrix-vector computations in EFT speed. It can be shown that the wavelet based approximation 
schemes for operator equations have the optimal computational complexity in the following sense: For 
a given approximation tolerance e > the wavelet approximation, when computed using iterative and 
adaptive techniques, requires the least number of arithmetical operations and storage locations of all A^- 
term approximation schemes with the same tolerance |40|. 

Furthermore, the wavelet transform yields powerful numerical schemes including very efficient adaptive 
algorithms. So the wavelet approximation is naturally adaptive. In addition to that the wavelet based 
approximation method combined with the multilevel techniques (i.e. the multiresolution analysis) are very 
good preconditioners for linear systems |41 1. It can be shown for a very general class of operators that the 
scaled stiffness matrix has uniformly bounded condition numbers |42|. 

The rate of approximation depends on the smoothness of the function to be approximated and the order 
of the wavelets. Assuming that the solution tp of the eigenvalue problem Q is sufficiently smooth it can 
be shown (see |43 1) that 

\\i^-'4'h\\Lp(R'') <c{ip)n'^. (16) 

Here n denotes the number of basis functions on the highest resolution level of the wavelet approximation, 
s is the approximation order of the wavelets and the constant c{ip)is independent on n. This approximation 
property is exactly the same as for the polynomial finite element spaces. 

The application of wavelet analysis in M.'^ has been considered inpractical because the number of scaling 
functions increases which affects the size of the wavelet basis. This holds for the tensor product wavelets, 
or separable wavelets, which will be obtained from the dyadic wavelet basis in one dimensions. However, 
by using the non-separable wavelet basis we will retain the same situation as in one-dimensional case. We 
have only one scaling function. These wavelet basis will be obtained by using the isotropic scaling matrix 
M with the determinant det(Af ) = 2, i.e. the matrix has integer coefficients. The non-separable wavelet 
basis have one more advantage over conventional separable wavelet functions. Because of their isotropy 
they are useful for rotationally invariant systems. 

2.4. 1 Interpolating wavelet basis set 

A biorthogonal wavelet family of degree m is characterized by a mother scaling function ifi, a mother dual 
scaling function cp, and four finite filters hj , gj , hj , and gj . We follow the convention in f44\ where the 
nonzero elements of the filters lie in the range j — ~m, . . . ,m. The mother wavelet ip and the mother dual 
wavelet are determined by the mother scaling function, the mother dual scaling function, and the four 
filters. 
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Interpolating wavelets Il45ll46ll44l are one biorthogonal wavelet family II47II44I . Interpolating wavelets 
enable simple calculation of matrix elements and expansion of functions in a basis function set because of 
the special form of the dual scaling functions and dual wavelets. Actually the interpolating dual scaling 
functions and interpolating dual wavelets are not functions but distributions. 

For interpolating wavelet family of degree m, the mother scaling function (p is constructed by recur- 
sively applying polynomial interpolation of degree m to data Si — 6ifi and the mother dual scaling function 
is 

^{x)^S{x), (17) 
where 6 is the Dirac delta function. For interpolating wavelets the coefficients hj satisfy 1441 

= '/^(j/S), j = -m,...,m. (18) 

For biorthogonal wavelets the matrix elements of operators are not computed as ordinary inner products 
but they are computed as integrals involving the basis functions and so called dual basis functions. The 
matrix elements of an operator ^ in a basis set consisting of biorthogonal wavelets are defined by 

/oo 
Q{x) ACj{x)dx, (19) 

where Q, j = 1, . . . , N aie the basis functions, Q, i = 1, . . . , TV are the dual basis functions, and N is 
the size of the finite basis function set. 



3 Linear algebra 

The numerical task of solving the discretized versions of the Kohn-Sham equations, Hartree-Fock equa- 
tions or equations for the Green's functions consists of repeated solutions of linear systems of equations 
and/or eigenvalue problems, whereas the real-time propagation methods of TDDFT involve exponentia- 
tion of large matrices. In other words, problems of linear algebra |48J. All three discretization techniques 
discussed in Sec.|2|lead to sparse matrix problems. Typically these matrices are so large that it is impor- 
tant to store only their nonzero elements, using e.g. the CRS (Compressed Row Storage) format |49 1 or 
define the matrices only by a subroutine which operates upon a vector**. Notably, in the FD-method with a 
uniform grid the nondiagonal elements of the discretized Laplacian are the same on each row and need not 
be stored. 

Because of the large size of the matrices, straightforward utilization of standard linear algebra packages 
for dense matrices, such as Lapack 1501 . is not an option'. Availability of a good selection of efficient tools 
for sparse matrix problems is thus a necessary prerequisite for large scale real-space electronic structure 
calculations. 

It was emphasized in the notation of Eq. |4] that a part of the Hamiltonian H, the effective potential 
Vcff [n](r), is a functional of the electron density, which on the other hand needs to be determined from 
the sum of the squared moduli of the occupied orbitals, which are the eigenfunctions of H. We are thus 
presented with a nonlinear eigenvalue problem"'. There exists a general multigrid formulation called Full 

^ One often, e.g. in the method of Secs. l3.2!^ and l4!2l as well as in the standard plane-wave approach, defines the "matrix" for 
the kinetic energy operator as a sequence of an FFT transform, multipHcation by — ifc^, and an inverse FFT transform. In that case, 
of course, the underlying discretization method is none of the three of Sec.lzl 

' There exists, however, at least one promising starting point for the equivalent standard library for sparse (and "matrix-free", see 
below) matrix problems, called Sparskit 1 51 1. Utihzation and further development (if necessary) of such libraries is important in our 
opinion - the recycHng of as many tools as possible is one of the main current trends in real-space electronic structure calculations as 
well as more generally in computational science. 

Strictly speaking, an eigenvalue problem with fixed potential is also a nonlinear problem, as the eigenvalue and eigenfunction 
need to be determined simultaneously. Nevertheless, such problems ai'e generally regai'ded as subfields of lineal' algebra. 
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Approximation Storage (FAS) 1521 which is in principle directly applicable to such nonlinear problems. 
Results from work on the development of nonlinear FAS eigenvalue solvers have been reported by Beck 
et al." L54. 55 1 and Costiner and Ta'asan |56|. Apart from that, one can try to minimize directly the 
corresponding nonquadratic total energy functional using the preconditioned steepest descent (SD) 1291 15 71 
or conjugate-gradient (CG) |58 1 method, in which the electron density and hence the effective potential can 
be updated after each update of the approximate orbitals. However, an iterative diagonalization scheme, 
in which the Hamiltonian is repeatedly diagonalized for a fixed Vi^ff , and the self-consistent Vcff is found 
using some "mixing scheme" in an outer loop, can be at least equally efficient [59 '601 ■ We have opted for 
the latter approach in all of our example calculations presented in Sec.|5] Some of the systems considered 
require special care in the choice of the mixing scheme in order to find a convergent iteration''^. Specifically, 
we want to mention here the GRPulay method 1 6 1 1 and the Newton-Raphson (or response function) method 
of Refs. 1,62, .63J which we have found useful in Sec. l5.5l and in Secs. l5.ll and l5.2l respectively. For more 
information on our recent work related to self-consistency iterations see Refs. 1 641 1651 . 

Apart from the nonlinear self-consistency problem which we solve in an outer loop, we are left with 
only standard problems of linear algebra. The problems relevant for our work can be divided in three 
classes: systems of linear equations (Sec. 13. il l, eigenvalue problems (Sec. l3.2t and matrix exponentiation, 
which is needed in time propagation schemes 1661 . as briefly discussed in Sec. 16. II 

3 . 1 Systems of linear equations 

Large linear systems of equations occur in real-space electronic structure calculations in many contexts. 
In the example calculations of Sec.|5]the discretized Poisson equation, the inversion step in the shift and 
invert mode of the Lanczos method (Sec. 13.211 . the equations of full-response (Eg. I30> and collective 
approximation (|63 1) formulations of the response function method, and the matrix inversion for obtaining 
the Green's function from Eq.|36l (after discretization with the finite-element method) occur. Usually it is 
most convenient to solve such systems with iterative methods [67 1. However, for matrix inversion, where 
the same equation has to be solved many times with different right-hand sides, direct methods are more 
efficient. 

3.1.1 Iterative solvers 

For symmetric positive definite matrices, the conjugate gradient (CG) method is often the standard choice. 
For the dense "matrix-free" problems'^ occuring in the application of the response function method to 
two-dimensional quantum dot problems (Sec. l5.lt the CG method was found efficient even in the absence 
of any preconditioned 

Often, however, it is important to accelerate the convergence of the CG method through precondition- 
ing. Preconditioning the equation Ax = b corresponds to multiplying the equation by an approximate 
inverse B « A^^, resulting in x « BAx = Bb. In our applications of the finite-element method to all- 
electron calculations of molecules (Sec. l5.4> the CG method with preconditioners based on incomplete LU 
factorization (ILU) [68 1 and the multigrid method ||69I I70 71 1 was applied to the linear system of equa- 
tions occuring in the inversion step of the shift and invert mode of the Lanczos method for the eigenvalue 
problem (see Sec. 13.2. iT . 

A multigrid method where the Gauss-Seidel method is used as a smoother, is used as a solver for 
the linear system of equations that results from the FD-discretization of the Poisson equation within the 
MIKA-package <64ll6gl . 

' ' However, this work has thus far been mainly focused on applying the FAS algorithm of Ref. 1531 to the eigenvalue problem 
with fixed potential, and updating the potential in an outer loop. 

It is expected, that similar convergence problems are present in the direct approaches as well. 

We refer to matrices whose action on a vector can be easily implemented as a subroutine, but the storing of whose matrix 
elements in impractical due to their large size, as matrix free. 
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In the axial symmetry, the matrix for the Laplacian within the FD-method is not symmetric. However, 
we have found that our MG-method with the Gauss-Seidel method as a smoother converges. On the other 
hand, the CG-method for the full-response equation |30| is not applicable. The best method we could find 
for this problem was the GMRES method with no preconditioning For the corresponding equation of 
the collective approximation |63| we found no useful iterative scheme. This problem obviously calls for 
further work. A good starting point is probably to take a look at the algorithms available in the Sparskit 
package lISTl . 

3.1.2 Direct solvers 

When solving for the inverse of a matrix as is the case in the calculations employing the Green's function 
(see Sec. l5.5> it is not usually feasible to use iterative methods since the iteration must usually be performed 
separately for each column of the inverse. Instead, direct methods for sparse matrices provide an attractive 
alternative since part of the computational work needs to performed only once per inverse. Modern direct 
methods for sparse matrices are based on the frontal factorisation algorithm of the sparse matrix II721 1731 . 
The algorithm starts with symbolic factorisation of the matrix, i.e. heuristically finding a permutation that 
is intended to minimise the fill-in in the numerical factors. Next, a sparse Cholesky or LU-factorisation is 
computed using block operations with dense BLAS kernels. Finally the problem is solved with backward 
and forward substitutions. If the entire inverse is desired only the final substitution steps must be performed 
for each column of the inverse whereas the matrix factors remain unaltered. Several implementations of 
the frontal method are available as software libraries, e.g. 1741 1751 l76l . 

3.2 Eigenproblem solvers 

Mostly these eigenproblems are Hermitian, so that Lanczos based iterations are the starting point, if one 
looks at the problem from the point of view of a numerical analyst II77I . For physicists the starting point is 
often the preconditioned conjugate gradient method (PCG)'"* [58]. Also nonhermitian eigenvalue problems 
sometimes occur. This happens e.g. when generalized finite-difference discretizations are used (e.g. the 
Mehrstellen discretization of Ref. I29II57I X if the method of adaptive coordinates is applied without extra 
care to guarantee symmetric matrices Il79l l5l. and also when the FD-method is applied in axial symmetry 
L80J . 

In this section we describe those eigenproblem solvers that are used in the example calculations of 
Sec.|5] as well as the residual minimization method (Sec. l3.2.'3t . which is at the core of the GridPaw-code 
(Sec. l4.3> . on which the main line of development within the MIKA-project is currently based on. 

This is not an exhaustive list of eigenproblem solvers. In finite-difference based electronic structure 
calculations also the method of steepest descent with multigrid preconditioning on global grids 1291 1571 
as well as in an almost linear scaling implementation based on localized orbitals (support functions pre- 
sented on a grid) 1131 is used. The highly efficient method utilized in Parsec is based on a taylor-made 
parallel generalized Davidson method |81 1. Recently, a new preconditioned, Krylov-space technique has 
been introduced in the mathematical litterature by A. Knyazev |82 ,83J - this method is claimed to be 
more efficient than its precursors, and certainly deserves to be properly tested in challenging applications. 
Relatively large'' and dense eigenvalue problems of a different structure occur when computing excitation 
energies and oscillator strengths from TDDFT from the Casida equation [84, 851 . Here, the Davidson 
method 1861 1871 is often used 1881 1891 for finding a few of the lowest excitation energies. According 
to Ref. f90l, however, it is also feasible to utilize the Lapack 1501 or ScaLapack 1911 routines for all 
eigenvalues of dense matrices in this case. 

Interestingly, the performance of PCG and Lanczos methods has been compared in the context of plane-wave methods in 
Ref. EH, where it was found that for the diagonaUzation step in a fixed potential Vf,({ of a 900-atom Si-cluster, the Lanczos method 
is about an order of magnitude faster than the PCG method. 

" The dimension of these matrices is small in comparison to the Hamiltonian matrices occuring in typical FD-based calculations, 
but large in comparison to those occuring in typical calculations with atom-centered basis functions 
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3.2. 1 Lanczos and Arnoldi methods 



The Arnoldi Package Arpack 1921 contains an implementation of the Lanczos method as well as its gener- 
alization to nonsymmetric problems, the Arnoldi method. It is used through a reverse communication user 
interface, i.e. the user has to provide his or her own subroutines for matrix-vector products and solvers 
for Unear systems of equations, and Arpack calls these user defined subroutines when needed. Thus the 
efficiency of Arpack actually depends heavily on the efficiency of these user defined subroutines. 

We have used Arpack in the so-called shift and invert mode in the finite-element example calculations 
presented in Sec. 15. 41 The computationally expensive step in these calculations was the solution of a linear 
system of equations, for which the CG-method was applied as discussed above in Sec. 13.1.11 

3.2.2 Rayleigh-quotient multigrid 

The Rayleigh-quotient multigrid (RQMG) method 1931 is originally developed for the generalized eigen- 
problem 

Hu = eBu (20) 

that arises from a FD-discretization of an equation of the form of Eq. Unless generalized finite-difference 
methods (such as those of Refs. I29II30||3T1 ) are used, B = I and H is Hermitian (in fact real and symmet- 
ric). In the general case, it is assumed that B~^H is Hermitian'^, thus the eigenvectors are orthonormal'^ 
Given the eigenvectors {u^; 1 < i < fc}, u^+i is obtained by finding the minimum 

of the functional 



c^r 1 -^fe+i-f^Ufc+i ^ |ufufc+i| 



In the minimization process, a hierarchy of grids is utilized, and on each grid, one degree of freedom is 
varied at a time and the minimum of the functional is found along the coiTesponding line in the space 
of u's. Varying one degree of freedom on a coarse grid corresponds to varying the u values at multiple 
grid points on the fine grid at the same time. More details on the RQMG method can be found from Refs. 
(93l|54||80l|64l|63 . Our implementation of the RQMG method was motivated by the method due to Mandel 
and McCormick f951, which was designed for the solution of the eigenvector with lowest eigenvalue from 
a generalized eigenproblem derived from a finite-element discretization. 



3.2.3 Residual minimization method 

Variants of the residual minimization method of Wood and Zunger [96] are used as eigensolvers in the 
plane-wave code VASP |59 60 1 as well as in the FD-PAW package GridPaw [97J. The basic idea is to 
update the orbitals at each iteration by taking a step along the preconditioned residual PR„: 

u; = u„ + APR„. (22) 

The residual is defined as 



R„ = {H- £„5)u„, (23) 



This is indeed the case for the Mehrstellen discretization introduced in Refs. I29II57I . the authors are not aware if this holds 
for the more general high-order compact (HOC) discretizations of Ref iilj . 

We have also considered a generalization which requires an overlap matrix S here: u^5uj = Sij. Such a generalization 
would be needed within the generalized double grid scheme I64II65I . where the discretization matrices H, B, S are derived through 
"Galerkin transfer" from the FD-matrices B, / on an auxiliary finer level. In a finite element implementation of RQMG, the 
matrices B and S ai'e the same, and both H and S are also Hermitian. 
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where e„ is the present estimate of the eigenvalue computed as the Rayleigh quotient, and the precondi- 
tioner P is discussed below. A is chosen such that the residual at the new guess u'^ 

R; = (ff - £„5)u; = R„ + A(iJ - e„5)PR„ (24) 

is minimized. This amounts to finding the minimum of a second order polynomial in A. In the implemen- 
tation of Ref. L97J . an additional step is then performed by taking also a step of length A in the direction of 
PR' 

u„ ^ u„ + APR„ + APR;. (25) 

In the implementation of Refs. |59 60 1 the residual is minimized in a multidimensional space spanned by 
the present guess u„ and several preconditioned residuals from previous iterations - thus the name RMM- 
DIIS (Residual metric minimization with direct inversion in the iterative subspace). The general idea of 
minimizing a (preconditioned) residual in a multidimensional space spanned by previous residuals is the 
same as in the well-known Pulay (DIIS) mixing schemes I 98ii99ii59ii60ii61j . 

The optimal preconditioned residual would be obtained by solving R.„ = PR„ from {H — eS')R„ = 
R„. In Ref. 1971 one solves instead approximately the simpler equation — iV^Rn = R„ with the aid of a 
single multigrid V-cycle. 

3.2.4 Diffusion algorithm 

In connection with the program package developed in the group of Prof. Eckhard Krotscheck (see 
Sec. \4-.2l the lowest n solutions of the single-electron Schrodinger equation are solved by applying the 
evolution operator, T(e) = e"*^^, repeatedly to a set of states {tpj, I < j < n}, and orthogonalizing the 
states after every step. Instead of the commonly used second-order factorization in combination with the 
Gram-Schmidt orthogonalization, the fourth-order factorization for the evolution operator LIOOJ is used. It 
is given by 

ri^) ^^-yV^-yT^-ieV^-l.T^-ieV ^^-elH+0(e-)]. y ^ V + -e^[V, [T,V]], (26) 

48 

where one operates with the potential energy operators V, V in real-space, and with the kinetic energy 
operator T in the k space. Switching between spaces is expedited via fast fourier transforms (FFT). In- 
stead of Gramm-Schmidt orthogonalization, the Hamiltonian is diagonalized in the subspace of present 
approximate solutions and a new set of orthonormal states is thereby obtained. This step, often referred to 
as a subspace rotation, is in fact an application of the Petrov-Galerkin method of Sec l2.2l with the present 
approximate solutions as basis functions. 

Depending on the physical system this method can be faster by up to a factor of 100 in comparison to 
the second-order factorization. 

4 Six projects 

This paper collects together material from essentially three separate lines of work. One of them is the 
MIKA project (see Sec. 14. 1> . which is in fact an umbrella for a number of activities related to real-space 
methods. The first priority in the MIKA-project is on the expansion of these activities to include also 
time-dependent problems. The MIKA-project has aheady spawned important international collaborations 
with two other projects (see Secs. l4.2l and l4.3> . Another line of work is defined by its aim to promote the 
applications of the finite-element methods in electronic structure calculations. Actual work related to the 
finite-element approach has been performed with two codes - one of them is the well established general 
purpose Elmer package (see Sec. l4.4> developed at CSC, and the other one has been recently developed 
for transport calculations (see Sec. l4.5> . The third and thus far entirely independent line is centered around 
the wavelet approach (see Sec. l4.6> . 
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4. 1 MIKA - a project and a program package 

The finite-difference program package MIKA has been described in Refs. 1941 1641 l65l . Example cal- 
culations using its components are described in those references as well as in Sees. OO andIO 
of this article. The common denominator of these programs is the utilization of the RQMG-method 
(Sec. 13.2.21 Ref. 1931 ) as an eigensolver The MIKA-package can be downloaded freely from the internet 
at http://www.csc.fl/physics/mika/. 

The MIKA-project, on the other hand should not be understood as too tightly bound to the MIKA- 
package. Instead, it is a project for the development of real-space methods in general, and currently the 
main emphasis is on the expansion of the activities to dynamic phenomena described by the time-dependent 
density-functional theory (TDDFT) |3 1. We have chosen the GridPaw-code (Sec. l4.3> as the basis for this 
development, although we, at the same time, monitor the development of the alternative FE-approach based 
on the Elmer-package ('Sec. l4.4> . 

4.2 Diffusion-algorithm and response-function package 

The real-space program package developed at the University of Linz 1621 1 1001 153l IIOII 11021 for solving 
the KS equations can be thought as an alternative to the approach used in MIKA. Instead of using multi- 
grid methods, the eigenvalue problem is solved in this program by using a diffusion algorithm, namely a 
highly efficient fourth-order factorization of the evolution operator (see Sec. 13.2.41 . Secondly, the number 
of self-consistent iterations is significally reduced by applying a response-function method. Within the 
MIKA package, the response-function algorithm has been implemented into MIKA/cyl2 (see Sec. 15. 2> . 
The research applications of the two programs to quantum dot studies have been rather similar, and the 
latest research results summarized in Sec. lS.ll have largely originated from collaborations between the two 
projects. 

4.3 GridPaw 

Also the GridPaw code developed in the Center for Atomic-scale Materials Physics (CAMP) of the Tech- 
nical University of Denmark is a finite-difference program package for the Kohn-Sham equations, in which 
strategies different from those in the MIKA -package have been chosen L97.I . Instead of norm-conserving 
pseudopotentials, the Projector Augmented Wave (PAW) approach ll9l llOI is implemented. A very impor- 
tant technical detail in this implementation is the utilization of the double-grid method 1 16|, discussed in 
Sec. 12.11 Instead of the RQMG method, the residual minimization method with multigrid precondition- 
ing is applied to solve the eigenvalue problem as described in Sec. 13.2.31 Within the second phase of the 
MIKA-project, a close collaboration with the GridPaw project is essential, as described in Sec. 16. II 

4.4 Ekner 

Elmer is a general purpose finite element solver and collection of library subroutines for solving systems 
of partial differential equations by the finite element method. 

Given a finite element partition of an n-dimensional computational domain, the program automatically 
constructs graphs for system matrices in the compressed row storage format, optimizes bandwidth, builds 
up the quadtree search structures and, if required, the restriction and extension operators for projecting 
functions between different levels of finite element spaces in multigrid analysis. 

The program provides interpolation basis functions and their gradients for simplices and n-cubes for 
any given polynomial degree up to ten, and automatically determines the optimal quadrature for numer- 
ically evaluating their inner products. The local element matrices representing the disretized differential 
operators are finally assembled in global structures using the predetermined graphs. The global system 
is solved either with standard multifrontal decomposition methods or Krylov-type iteration schemes. The 
iterative solvers may be preconditioned by standard incomplete decompositions, or multigrid methods. For 
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eigenvalue problems the program utilizes the Arnoldi-method as implemented in the Arpack -package 1921 
in the shift-and-invert mode, with its own linear system solvers applied to the computationally expensive 
invertion-step of the algorithm. 

Basically, the program is well suited for solving problems that can be posed in variational form. The 
classical Kohn-Sham scheme of DFT, for example, fits into this framework extremely well. So far, we have 
performed some preliminary tests for computing the electronic charge density for simple molecules like 
CO and Cgo (see Sec. l5.4> . The results of these all-electron calculations are promising, but there is some 
work that could be done to enhance the overall performance of the code. The major problems are not in 
the finite element discretization itself, but in the performance of the eigenvalue solver, and the relaxation 
techniques needed to avoid the charge sloshing phenomenon of the Kohn-Sham scheme. Here one can 
directly utilize experience gathered using more traditional discretization schemes (see Sec.|3}. 

For more information, see http://www.csc.fl/elmer/. 

4.5 Finite-element method for electron transport 

The program for modeling transport properties of the nanostructures is written in the Laboratory of Physics 
within a co-operation with the Institute of Mathematics both in the Helsinki University of Technology. The 
project is in detail presented in the thesis 1 103 1. 

A nanostructure between leads is a typical electron transport problem. This kind of systems are prob- 
lematic to the DFT with eigenfunction methods, because the system is of infinite size without periodicity. 
We have solved this problem using the Green's function methods with the open boundary conditions. The 
code has three versions: one-, two-, and three-dimensional, so that different types of nanostructures can 
be modeled. The numerical implementation is done using the finite-element method with p-elements up to 
the fourth order We use mesh generator Easymesh 1 38 1 in the two-dimensional calculations, and Netgen 
L14J in the three-dimensional ones. 

4.6 Wavelet package 

The interpolating wavelet package for the computation of the atomic orbitals has been written in the Insti- 
tute of Physics in the Tampere University of Technology in co-operation with the Mathematics Division 
of the Faculty of Technology from the University of Oulu. The aim of the effort is to develop the wavelet 
method for the quantum mechanical applications. Instead of using the orthogonal Daubechies wavelets 
the biorthogonal interpolating wavelets have been utilized in the implementation of the code. Up to now 
the atomic orbitals have been computed for some light many-electron atoms (ions). In the implementa- 
tion we have used the nonstandard operator form, since it provides an efficient method to carry out the 
multi-resolution analysis in the electronic structure calculations. 

5 Calculation examples 

In this section we present example calculations from each of the three lines of work. The MIKA-project 
emphasizes the finite-difference method, and thus far has also been centered on the MIKA-package [64j 
1651 . Applications of this package to quantum dots, surface nanostructures and positron calculations are 
presented in Sections l5 . 1 1 15 .21 and 15 .31 respectively. Note, however, that many of the results reported in 
Sec. l5.1l have been calculated with the response function package described in Sec. 14.21 The introduction 
of the finite-element method to electronic structure problems materializes in Sec. 15.41 where all-electron 
calculations for CO and Ceo ^"e presented, and in Sec 15.51 where transport calculations based on the 
nonequilibrium Green's functions with norm-conserving pseudopotentials are presented. Preliminary re- 
sults of our implementation of the wavelet approach to electronic structure calculations are presented in 
Sec.ESl 
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5.1 Recent real-space calculations on two-dimensional quantum dots 

In this Section we present a brief review on our recent computational results for two-dimensional quan- 
tum dots (QD's) [104]. We consider QD's fabricated at the interfaces of semiconductor heterostructures 
(e.g. GaAs/AlGaAs), where the two-dimensional electron gas (2DEG) is restricted. The many-electron 
Hamiltonian for a QD in the presence of a magnetic field is written in SI units as 

^ = ^ E t-*^^' + ^A(r.)]' + E A^eoe\v,-r,\ + E + ^V^S..,] , (27) 

where the vector potential is chosen in the symmetric gauge to define the magnetic field perpendicular to 
the dot plane. We use the effective-mass approximation with the material parameters for GaAs, i.e., the 
effective mass to*=0.067 nie and the dielectric constant e — 12 A — 13. The external confining potential is 
determined by V^cxt (r) and the last term is the Zeeman energy. 

In the calculations we apply the SDFT in the self-consistent KS formulation. In high magnetic fields we 
have also employed the computationally more demanding current-spin-density-functional theory (CSDFT), 
which does not, however, represent a considerable qualitative improvement over the SDFT. A detailed 
comparison between these two methods for a six-electron quantum dot can be found in Ref. 1 105 1. 

In the numerical process of solving the KS equations we have used both the response-function package 
(see Sec.l4.2> as well as the 2D version of the MIKA package (MIKA/RS2dot). Recent QD applications 
studied using these methods can be found in Secs. l5.1.'T] and l5T!2l respectively. Within the both methods, 
the effective single-electron Schrodinger equation is solved on a two-dimensional point grid. In practical 
calculations, the number of grid points is set between 128 and 196 in one direction. This gives less than ~ 1 
nm for a typical grid spacing, which is sufficient for describing electrons in GaAs. Within MIKA/RS2dot, 
obtaining full convergence typically takes 100 . . . 500 self-consistency iterations. By using the response- 
function methods this number can be typically reduced by a factor of ten. 

5.1.1 Statistics of quantum-dot ensembles 

We have applied the response-function algorithm presented in the previous section to study the statistical 
properties of quantum dots (in zero magnetic fields) affected by external impurities I fD61 IIOTI . In the 
Hamiltonian given in Eq. |^the external potential Vey±{v) consists of a parabohc confinement Vc{r) = 
wCuJ^r^ 12 with and the impurity potential 

47reeo\/(r-Rfc)2+4 



where N^^^ is the number of impurities and and are their (random) lateral and vertical positions in 
the ranges of < < 100 nm and < cife < 10 nm, respectively. For each fixed iVimp = 5 ... 50 we 
apply 1000 random impurity configurations. 

In a noninteracting system the addition energy for a certain electron number N is equal to the eigenlevel 
spacing, i.e., Ao(A^) — eAr/2+i — where the divisor of two follows from the spin degeneracy. We 
have shown that in this case the resulting addition-energy distribution is a combination of Poisson and 
Wigner-Dyson distributions, corresponding to regular and chaotic systems, respectively [1061. 

In the interacting many-electron system the addition energy is given as the second energy difference, 
A(iV) — E{N — 1) — 2E{N) + E{N + 1). The ground-state energies are chosen from the spin states 
with lowest energies. We calculated all the relevant spin configurations for different electron numbers N, 
so that taking the impurity configurations into account, a total number of ^ 10^ self-consistent SDFT 
calculations were performed. We note that this would not have been manageable (in a reasonable time) 
using conventional mixing schemes in solving the KS equations. 
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Fig. 2 (Color online) Total energy for different spin states in a 16-electron quantum dot vs. the sum of the noninter- 
acting level splittings on the fourth energy shell. The number of impurities are A'^imp ~ 5 (a) and Mmp = 20 (b). 
Reprinted with permission from E. Rasanen and M. Aichinger, Phys. Rev. B 72, 045352 (2005). Copyright (2005) by 
the American Physical Society. 

There is an essential difference in the addition-energy behavior of noninteracting and interacting sys- 
tems as a function of Mmp- Namely, when the impurity number is increased the distribution of the inter- 
acting system becomes symmetrical with Gaussian tails. This result is qualitatively similar to what has 
been found in experiments for large QD's 8108111091 . as well as with previous DFT calculations for dis- 
ordered QD's 1 110 111 |. The result demonstrates the necessity for comprehensive treatment of the e-e 
interactions beyond the random matrix theory combined with the constant-interaction model. In our sys- 
tem, the approximate impurity density required for the symmetrization of the addition-energy distribution 
is ~ 10^"^ nm^^, which is of the same order of magnitude as the one used by Hirose et al. 1 1 10| in their 
calculations of disordered QD's. 

We have also analyzed the dependence of the spin states on the impurity density and on the corre- 
sponding noninteracting level statistics 1 107|. Figure 121 shows the ground-state spins {S = 0, 1, 2) for a 
16-electron QD with four impurity numbers A^imp — 5 (a) and 20 (b). The results are shown in a plane 
spanned by the sum of the level splittings on the fourth shell, eio — £7, and the total energy i^tot- The 
fraction of the 5 = 2 states strongly decreases as A^imp is increased. However, the relation of the 5 = 
and 5=1 states gradually saturates toward one. The saturation effect is analyzed in Ref. 1 107 1 in detail. 
In the case of a small impurity density, the resulting spin states show strong statistical dependence on the 
level splittings: when the level spacing is small, partial spin polarization is probable due to Hund's rule. 
This applies especially for 5 = 2 states which are clustered in Fig. |2ja). However, when the impurity 
density is increased the spin-state correlation totally disappears. This indicates that in strongly distorted 
systems the spin of the many-electron ground state can not be predicted from the single-electron spectrum 
due to the complicity of the e-e interactions. 

5.1.2 Quantum Hall regime 

The real-space method based on MIKA/RS2dothas proven to be a reliable and efficient density-functional 
approach to study finite 2D electron systems in magnetic fields |112 113 114 115116). The original 
inspiration for our work is related with the well-known quantum Hall effect in 2DEG II 171 . as well as 
in the electron transport |118| and magnetization fl 19] measurements, which both have revealed a rich 
variety of transitions in the energetics of finite electron systems as a function of the magnetic field. 

We have recently shown using both the SDFT and the exact diagonalization (ED), that in QD's repre- 
ssr\\mg finite-size quantum Hall droplets at strong magnetic fields the off-electron zeros may localize II 121 . 
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(a) ED (b) SDFT 




Fig. 3 Electron densities (gray scale) for two- vortex solutions in elliptical quantum dots (eccentricity = 1.2) cal- 
culated using the ED (a) and SDFT within MIKA/RS2dot(b). Reprinted with permission from H. Saarikoski et al., 
Phys. Rev. B 71, 035421 (2005). Copyright (2005) by the American Physical Society. 



In a parabolic confinement, vortices form stable, clustered configurations with successive transitions be- 
tween them, as the magnetic field (and thus angular momentum) is increased fl 12', "1201. The mean-field 
SDFT densities and the conditional wave functions of the ED results shows similar vortex structures. Fur- 
thermore, QD's defined by a non-circular, e.g., elliptic or rectangular confining potential, exhibit a good 
agreement between the total densities obtained with SDFT and ED calculations, respectively f 1131. Fig- 
ure |3] shows the electron density obtained using the ED (a) and SDFT (b) for two-vortex solutions in 
elliptical QD's. In the SDFT results the vortices are strongly localized with the electron density close to 
zero in the core, which reflects the fact that the mean-field method is not able to include all the corre- 
lations of the exact many-body state. We have found similar differences between the ED and SDFT for 
the so-called giant-vortex states, which are formed if a quartic term (flatness) is included to the otherwise 
parabolic potential [1151 . 

The state transitions in the quantum Hall regime are also visible as oscillations in the QD energetics, 
e.g., in the chemical potentials fJ,{N) = E{N) — E{N — 1) (Ref. 11141 ). At magnetic fields below the 
total spin polarization we have found finite-size counterparts of the integer and half-integer quantum Hall 
states, as well as a developing pattern of de Haas-van Alphen oscillations in the magnetization M{N) = 
—dE{N) jdB (Ref. 11 161 ). These results are qualitatively consistent with the experimental magnetization 
data of large dot arrays \\ 191. At higher magnetic fields the signatures of above discussed vortex formation 
inside the QD are clearly visible in the magnetization oscillations, and the agreement between the SDFT 
and QMC results is good. These oscillations should be observable in accurate magnetization measurements 
of QD devices. 

5.2 Surface nanostructures studied with MIKA/cyl2 

5.2.1 Calculations in axial symmetry 

Many interesting nanostructures, such as adatoms on the surfaces, circular quantum dots and quantum 
corrals can be modelled in axial symmetry. Numerically the problem is reduced from 3D to 2D which 
makes the calculations drastically easier, and allows modelling of much larger systems. 

In an axially symmetric potential the Schrodinger equation written in the cylindrical coordinates is 
separable. The Kohn-Sham orbitals are given as a product of the angular and (r,z) -dependent parts, i.e.. 



^mfe„(r) -e""^[/„fe„(r,z). (29) 

The formalism is nice from the computational point of view as the eigenstates with different m and k are 
automaticaUy orthogonal. This allows the problem to be splitted into various subproblems and we can take 
the full advantage of massively parallel computing environment. 



© 2006 WILEY- VCH Verlag GmbH & Co. KGaA, Weinheim 



pss header will be provided by the publisher 



21 



It is noteworthy that we can make use of this special form of wave functions also in the response-function 
scheme. The density correction is determined from Eq. (2.5) of Ref. 1 62 1, which reads after writing out the 
dielectric function, 



ApW(r)-V^)(r)^2^ ^P™^''^ / dVdVVp(r')V'.(r')14.-/.(r', r") V'Hr")- (30) 



Inserting (I29> and rewriting the integrals in the cylindrical coordinates we see that the integral over r' 
happily vanishes unless the wave functions share the same angular momentum quantum number This 
drastically reduces the number of terms in the above summation which is welcome when one tries to 
solve the linear equation using an iterative method such as conjugate gradient or GMRES where efficient 
calculation of the matrix operator is essential. Furthermore, "particle-hole interaction" Vp-h can also be 
crudely approximated by the Coulomb part of the effective potential (remember that it does not affect the 
final result, only the speed of the iteration process), i.e., Vp-h{v, r') « ^^^ry- Thus the integration over 
double primed coordinate gives just the electrostatic potential "caused" by the charge distribution 5p'^^^ (r'). 



so that the time consuming second integration can be circumvented by simply solving the corresponding 
Poisson equation, V^F(r) = —Att5p{y), that can be done efficiently in the axial symmetry. With the 
response-function scheme we have been able to solve self-consistently electron systems consisting of over 
10000 Kohn-Sham orbitals in approximately 100 cpu hours with IBMpower4 processors. 

5.2.2 Cu(ll 1) surface and structures 

The Cu(ll 1) surface is an example of a crystallographic cut of the material that places the Fermi energy of 
electrons propagating normal to the surface inside the bulk bandgap. Conduction electrons on the surface 
have energy close to the Fermi energy and are sandwiched on the surface, unable to escape into the vacuum 
because of the potential barrier, and forbidden to enter the bulk because of the band gap. They can however 
move parallel to the surface and thereby form a kind of two-dimensional electron gas. Adatoms deposited 
on the surface can affect the surface electron distribution, which offers interesting possibilities to study the 
properties of confined electrons, their interaction with adsorbates and many-body physics in general. 

We have used the MIKA/cyl2 software to study axial symmetric surface structures such as single 
adatoms and circular quantum corrals. The Cu substrate in our calculations is described by a ID model 
potential; the Cu(lll) planes, perpendicular to our symmetry axis (z), are considered to have uniform 
density, while in the z-direction we have an oscillating periodic potential that mimics the periodic struc- 
ture. This particular model was proposed by Chulkov et al. |121 1. In addition to the work function and 
the bandgap, the model potential produces correctly the Shockley surface states and the image-potential 
states. Previously, this model has been successfully used e.g. in studies of dielectric response-functions 
and lifetimes of excited states 1,122.1, and electron confinement in a metallic slab on solid surfaces within 
ID self-consistent DFT calculations 11231 . Once the surface is constructed, suitable pieces of jellium can 
be added to mimic various kinds of adsorbates. 

So far we have studied e.g. the behaviour of surface state wave functions when a single Pb-adatom 
is deposited on the surface. On the plain surface the state is delocalized and has a constant amplitude 
along the surface, where as with Pb-adatom, it is localised in the Pb adsorbate (Fig@}. In a more complex 
case we have examined surface state electrons inside a circular ring of adatoms deposited on the surface. 
These kind of quantum corrals have been extensively studied experimentally and theoretically within the 
scattering theory, where oscillations on the surface LDOS have been observed |1241. Our calculations 





(31) 
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Fig. 4 Two example calculations with MIKA/cyl2. Left: modulus of the first surface state wave function localized at 
the single Pb adatom deposited on the Cu(l 11) surface. Jellium profile of the adsorbate is pictured with the rectangular 
box. Right: 3D picture of the oscillations of the surface electron density inside a small, circular quantum corral. 

represent a somewhat different viewpoint as the LDOS is calculated starting from the Kohn-Sham orbitals 
instead of the scattering Green function. 



5.3 Positron calculations with MIKA/doppler 

The use of positron annihilation in defect studies is based on the trapping of positrons from a delocaUzed 
bulk state to a localized state at the defect (see Fig. |5^). The trapping is due to the reduced nuclear 
repulsion at the open-volume defects. Because the electronic structure seen by the positron at the defect 
differs from that in the perfect bulk crystal the annihilation characteristics change. The positron lifetime 
increases because the average electron density decreases. For the same reason the momentum distribution 
of annihilating electron-positron pairs becomes more peaked at low momenta. However, the positron 
density may sample the different atomic species of a compound material with different relative probabilities 
in the bulk and at a defect. The defect may be surrounded by impurity atoms. In these cases the high- 
momentum region of the distribution, which is mainly due to annihilation with core electrons, reflects the 
chemical structure of the defect. The changes in the bond structure between the atoms neighboring the 
defect may also affect the low-momentum part of the distribution. In order to understand these changes 
and fully benefit from them in defect identification, theoretical calculations with high predictive power are 
indispensable. 

The description of the electron-positron system can be formulated as a two-component density-functional 
theory LI 25. 1 . In the measurements there is only one positron in the solid sample at a time. Therefore, the 
density-functional scheme has to be properly purified from positron self-interaction effects. Comparisons 
between theoretical two-component DFT results and experimental results have shown that the follow- 
ing scheme is adequate. First, the electron density n(r) of the system is solved without the effect of 
the positron. This can be done using different (all-electron) electronic structure calculation methods. A 
surprisingly good approximation for the positron lifetime and core-electron momentum calculations is to 
simply superimpose free atom charges. Then the potential V+(r) felt by positron is constructed as a sum 
of the Coulomb potential 0(r) due to electrons and ions and the so-called (electron-positron) correlation 
potential Vcorr(r) which is treated in a local density approximation (LDA). In the zero-positron-density 
limit it is a function of the electron density n_ (r) only, i.e. 

y+(r) =0(r)+Korr(ri-(r)), (32) 
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The ensuing single-particle Schrodinger equation can be solved using similar techniques as used for the 
electron states. For example, we use the three-dimensional real-space Schrodinger equation solver of the 
MIKA package. When using self-consistent electronic charge density we use also the Poisson equation 
solver of the MIKA package to calculate in Eq. i32\ the Coulomb potential due to valence electrons. 

The scheme described above is for a delocalized positron the exact zero-positron-density limit of the 
two-component DFT. However, the approximation made in the case of a localized positron can be justified 
by considering the positron and its screening cloud as a neutral quasiparticle which does not affect the 
average electron density. 

When the electron density n(r) and the positron density n+(r) = |i/)+(r)p are known the positron 
annihilation rate is calculated within the LDA (in the zero-positron-density Umit) as an overlap integral 

X = — ~ nrlc J dr n+(r)n_(r)7(n_(r)), (33) 

where Tg is the classical electron radius, c the speed of light, and 7 the enhancement factor taking into 
account the pile-up of electron density at the positron (a correlation effect). The inverse of the annihilation 
rate is the positron lifetime r. 

We calculate the momentum distribution of the annihilating electron-positron pairs using the so-called 
state-dependent enhancement scheme 1 126| as 



pip) ^ TTrlcJ2lj [ dre-'P-"-^+(r)V, (r) 

3 



(34) 



where the state-dependent enhancement factor is written as 7^ — Aj/Aj^'^. \j is the annihilation rate of 
the state j within the LDA, 

Aj = Trr^c j drn+(r)nj(r)7(n_(r)), (35) 

and Aj^'^ is the annihilation rate of the state j within the independent-particle model (IPM, 7 = 1). 
Above, nj{r) = |V'j(r)P is the electron density of the state j. One can directly compare computational 
results with experimentally measured Doppler broadening of the 511 keV annihilation line or with exper- 
imentally measured angular correlation of the annihilation gammas. We have found that the use of the 
commonly employed position-dependent enhancement factor [enhancement taken into account with the 
factor •\/7(7i_(r)) inside the Fourier transform in Eq. (I34H leads to unphysical results when one compares 
the ratio of two Doppler spectra with the experiment 1 127 1. 

The MIKA/doppler program uses the atomic superposition method. One can either use the LDA 
parametrizations (enhancement factor and correlation potential) by Boroiiski and Nieminen 11251 or the 
gradient-corrected scheme by Barbiellini et al 11281 1129 ^. However, the atomic superposition method 
cannot be used for the low-momentum part due to valence electrons. For that purpose self-consistent 
all-electron valence wavefunctions have to be constructed. For example, we have used the projector 
augmented-wave (PAW) method implemented in the plane-wave code Vienna Ab initio Simulation Pack- 
age (VASP) 1 59 60 130 131 127|. Recent applications of the real-space solvers of the MIKA package to 
positron studies include, for example, a study of vacancy-impurity complexes in highly Sb-doped Si 11321 
in which computational Doppler spectra were used to identify experimentally detected unknown vacancy- 
type defects and and a study of the effects of positron localization on the Doppler broadening of the 
annihilation line in the case of Al 1 133 1. 

Figure|5t shows an example of a positron state calculated for a Ga vacancy in GaN. The corresponding 
Doppler spectrum calculated with the MIKA/doppler program is shown in Fig. |5j)- The figure shows 
also the decomposed momentum distributions corresponding to annihilations with electrons on different 
orbitals. Their weights are calculated with Eq. (I34> . 
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a) 




40 



Fig. 5 a) An isosurface of the positron wave function localized at a Ga vacancy in GaN. The Ga atoms are shown by 
green and the N atoms by blue spheres, b) The total and decomposed Doppler spectrum for the Ga vacancy in GaN 
calculated using the MIKA/doppler program. The contributions of the individual orbitals of the Ga and N atoms are 
shown by colored lines. 



Fig. 6 Some electronic structure calculations using Elmer. All-electron calculations for the molecules CO and Ceo 
were performed. Implementation of periodic boundary conditions was tested in the case of bulk silicon. See also the 
text. 

The MIKA/doppler program also enables one to calculate the forces on ions due to a localized positron 
within the atomic superposition approximation. These forces can be used together with electron-ion 
and ion-ion forces calculated with standard methods in order to find the relaxed ionic configuration of 
a vacancy-type defect at which there is a locaUzed positron 11271 . 

5.4 All-electron finite-element calculations 

Although the main emphasis of the MIKA-project has been thus far on the support and development of the 
MIKA-package, which is based on finite-differences, it has been interesting to perform some simple test 
calculations using the general purpose finite-element package Elmer ('Sec. l4.4> . Some of these calculations 
have been already reported in Ref . II80I . 
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Fig Elillustrates three test cases. As the first self-consistent calculation within the local-density approx- 
imation of the density-functional theory using Elmer, an all-electron calculation for the carbon monoxide 
molecule was performed. The top left panel of Fig. |6l illustrates the main features of the selected finite- 
element mesh'*. Since the divergent (1/r) potential has to be represented on the mesh, a very fine mesh is 
used in the immediate neighbourhood of the singularity. The mesh was generated using Netgen 1 14 |, con- 
sisted of 120 000 quadratic elements and 170 000 node points. Each node point corresponds to one basis 
function. To solve the Kohn-Sham eigenvalue problem, the Arpack package was utilized, with incomplete 
LU-factorization 1 68 1 as the preconditioner within the computationally expensive inversion part of the shift 
and invert step of the Lanczos method as discussed in Sec. 13.2.11 and Sec. 13.1.11 The initial guess for the 
effective potential was the sum of the bare nuclear potentials. We used an ad hoc linear mixing scheme, 
where the effective potential at each iteration was a linear combination of the input and output potential, 
with an exponentially decreasing weight for the input potential, resulting in fast convergence once the de- 
cay parameters were properly adjusted. In the middle panel of the upper row of Fig.|6la contour plot of a 
single-particle wave function provided by the Kohn-Sham scheme is shown. Also larger molecules can be 
treated using this all-electron scheme within Elmer. The top right panel of Fig. |6l shows a selected orbital 
from an all-electron calculation of the Cgo -molecule. The lower left panel illustrates the electron density of 
Cgo, obtained from a parallel calculation involving eight processors and 4 x 10^ degrees of freedom. In this 
calculation the preconditioner used in the CG-method of the inversion step was a multigrid method. The 
lower middle panel of Fig.|6lillustrates the partitioning of the mesh that was used - domains with different 
colours were mapped to different processors with the help of the Metis program [1341. Periodic boundary 
conditions were implemented within Elmer, and in order to compare the computational efficiency of the 
RQMG method and the Lanczos method, the Schrodinger equation was solved (i.e. a non-self-consistent 
calculation was done) in a periodic potential corresponding to bulk silicon using both methods. A uniform 
grid consisting of 32'^ points was used in the 64-atom supercell. The computational efficiencies of the two 
methods were found to be similar The lower right panel of Fig.|6|illustrates a selected eigenfunction from 
this periodic test case. 

The calculation on carbon monoxide reported above, originally performed in 2002, was revisited in 
2005, taking advantage of the novel mesh generator GiD II35I . This time only 28 000 second order ele- 
ments (39 000 basis functions) were required. The accuracy of the computation was checked by evaluating 
the dipole moment of the molecule, which is a well known basis-set sensitive quantity. The result — 0.235Z? 
was obtained, whereas the basis-set limit for this quantity is — 0.226D II36I . 

There is plenty of room for improvement in the computational efficiency of our Elmer-based solver, 
and the next steps required seem to be straightforward. Implementing norm-conserving pseudopotentials 
or PAW, gathering experience on making meshes with higher-order p-elements, possibly with the order 
variable in space (these are now available in Elmer) as well as paying more attention on the choice of the 
eigenproblem solver and mixing scheme will improve the efficiency considerably. 

5.5 Ballistic transport calculations using the Green's function method 

In this section we present calculations of finite element implementation for the transport properties of the 
nanostructures. 

In the DFT the electron density of a system is constructed from eigenfunctions of the single-particle 
Hamiltonian. For isolated systems, such as molecules or quantum dots, the whole system can be included 
in the calculation volume. However, this is not possible in general and some approximations have to be 
done. In the case of bulk materials the systems are treated as infinite by using periodic boundary conditions. 
In transport problems the nanostructure modeled is connected to two or more leads so that the system size 
is infinite but does not have periodicity. A solution to this problem is to calculate the electron density 
from the single-particle Green's function with the open boundary conditions 1 137|. This makes finite-size 

'** To be precise, the mesh in the picture is a two-dimensional mesh and not a cutplane through the three-dimensional mesh, as 
we were not able to draw such a cutplane 
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effects small. The Green's function method has also other useful properties in transport calculations. For 
example, it is possible to apply a bias voltage across the system and calculate the cuiTent through it. 

Green's functions in transport calculations are heavy to calculate numerically which has delayed their 
first-principles implementations until the recent years. The computational methods have to be chosen 
carefully. Typically one uses a special basis set, such as atomic orbitals |138 139 1, tailored to describe 
the equilibrium electronic properties of the system in question. Because a systematic error control is not 
straightforward with atomic orbitals, the development of other types of solvers is well motivated. We have 
implemented the Green's function transport formalism using the FEM with p-elements up to the fourth 
order polynomials [140J . The solver has one, two, and three-dimensional versions so that different types 
of nanostructures can be modeled. The other implementations beyond the atomic-orbitals ones employ 
an 0{N) optimized basis [1411, a wavelet basis |142|, full-potential linearized augmented plane-waves 
|,143J , maximally localized Wannier functions LI 44 J . a finite-difference method L145J . and a finite-element 
method fith Unear basis functions 11461 . 

i 
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Fig. 7 Schematic sketch of modelling a nanostructure. 

A brief introduction to the Green's function method is based on the schematic sketch in Fig. The 
calculation region is divided into the three parts ftji, il, and il^. The nanostructure is located in ft and it 
is connected to the two leads ftji and il l using the open boundary conditions on dH, ^ . This means that 
electrons can travel through the boundaries dH. l /r without reflection. The system is also open in the sense 
that during the iterations towards the self-consistent solution the number of electrons in fl is not a constant. 
On the boundaries 9$lpi/p2/p3/p4 Dirichlet's or periodic boundary conditions are applied. 

In the Green's function method the electron density is calculated from the retarded Green's function G^. 
This requires first the solving of the equation 

{oj + - V^eff(r)) G^r, r'; c.) - ^(r - r'), (36) 

where uj is the electron energy and Vcb is the effective potential in the system. The separation between the 
two solutions G^ and G" is made using the boundary conditions. In practice, most of the calculation time 
is spent in solving the corresponding equation. The calculation of the electron density requires integration 
over the electron energy so that G^ has to be calculated at several energy values. However, when the 
integration path is moved to the complex plane, the numerical integration does not require many points. 
Moreover, the integration is easy to parallelize. 

In transport calculations we are interested in the electron current. In the Green's function formulation 
the current is calculated using the Landauer type of formulation 

I= - f T{lo) - /p(a;)) du;. (37) 
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Note that the above tunnehng probabihty T{uj) is not the real probability in the sense that it can be larger 
than unity because it includes the summation over the different conducting channels. A channel can con- 
tribute to the conductance by one conductance quantum at maximum. Similarly to the electron density, T 
is also calculated using the function. 

In order to use the FEM we first write the equations of the Green's function method in the so-called 
variational form. This is how also the numerical form for the open boundary conditions can be derived. 
The result is analogous to the truncated matrix derivation used, for example, in the case of the finite- 
difference method 1 147 1. First we take a nicely-behaving arbitrary function i;(r) and multiply both sides 
of Eq. i36\ by it. Then we integrate over the calculation domain We use the properties of the open 
boundary system and make some manipulations 11481 . which are shown in detail in Ref. 11491 . After this 
the equation takes the form 

/ I - Vi;(r) • ivG'-(r, r'; + v{r) [uj - Kff(r)] G^{r, r';co)}dr 

Jn^ ^ ' (38) 

-(^:LG•^^;)-(^]J^G^^;)= v{v'), 

where the so-called self-energy operators of the leads have the form 

f l^r/ / / , d'^9e{rL'/R',rL/R;u;) (39) 



/ 
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The open boundary conditions of the system are included in the self-energy operators. They are the surface 
integrals over the open boundaries dflj^/ji. The function gei^^L' /r' :T^l/r\ ^) is the Green's function of 
the isolated lead 57^//^ with the zero boundary condition on the boundary dili^fn \ 148 1. In this equation 
9e{^L' /R' 5 /Ri t^) is differentiated with respect to both arguments in the direction of the normal vector 
T^L/R on the surface dflL/jf . 

The solution for G"' can be calculated froml38lbv using the finite-element approximation {r , r' ; ui) « 
d^ji^) '/''(r) '/>j(r') and choosing v{r) = (f>p{r). 

The calculation of the C function requires inverting of large matrixes. We use a direct sparse solver 
for this purpose 1.74.1 . A direct solver routine requires more computer memory than iterative methods. 
However, when one has to solve a lot of linear equations with the same coefficient matrix but different left 
hand sides direct routines are more efficient than iterative ones. 

We have used our FEM implementation for modeling of two-dimensional quantum point contacts II150I . 
Na atomic wires II140I . and thin Hf02 layers 11401 . 



5.6 Solution of atomic orbitals using interpolating wavelets 

Wavelets have been used for solving partial differential equations and the electronic structure, in particular, 
only recently I151lll52ll44lll53l . Most authors have chosen compactly supported orthonormal wavelets. 
Daubechies wavelets fl54"1531, Meyer wavelets [1551, and Mexican hat wavelets fT52| have been used. 
Three of the authors of the present paper performed electronic structure calculations for hydrogen and 
some many electron atoms with interpolating wavelets 11561 . Goedecker and Ivanov used interpolating 
wavelets to solve the Poisson equation 1 157]. 

Orthonormal wavelet families provide useful properties like recursive refinement relations and they lead 
to fast discrete wavelet transform for multire solution analysis. Wavelet methods are closely connected to 
point-grid based methods that also generalize to higher than one dimensions 1 158II159I . Lippert et al. 11601 
have used interpolating wavelets in point-grid based methods. 

Representation of operators in orthonormal wavelet bases has also been studied 11611 11621 11631 and 
we have introduced an algorithm to compute the standard operator form of an arbitrary operator from its 
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Fig. 8 Example of a nanostructure transport problem, i.e., a chain of three Na atoms. The left-hand side shows 
constant electron density surfaces against the backbone of the Na pseudoatoms (red spheres). The bulk electrodes are 
described using the jellium background charge. The right-hand side gives the tunneling probability as a function of the 
electron energy. The energy zero (dashed line) corresponds to the Fermi energy. 




23456789 2345678 
L resolution levels (L) 

Fig. 9 (a) Relative error of the hydrogen Is orbital eigenvalue, unit of length u — 1.0. (b) Energy eigenvalues of 2s, 
2p, 3s, 3p, 3d, and 4f orbitals of hydrogen with B — 15, 20, 30, 30, 30, and 45, respectively. Ratio of the computed 
and exact values is shown. Reprinted with permission from T. Hoynalanmaa et al., Phys. Rev. E 70, 066701 (2005). 
Copyright (2005) by the American Physical Society. 



nonstandard operator form. Nonstandard operator form decouples different resolution levels, which turns 
out to be an important aspect for numerical approaches. 

Due to spherical symmetry of atoms the three dimensional single-particle equation Hijj = eip of the 
hydrogen-like atoms is separated to the one dimensional radial part and the two dimensional angular part. 
Exact solutions to the latter are spherical harmonics Yime where £ and rrii are the orbital quantum number 
(angular momentum quantum number) and orbital magnetic quantum numbers, respectively. Writing i?„£ 
for the radial wavefunction the hydrogen-like orbitals are 4'nimi {r, 0, (f>) = Rni{r) Ygmi {S, 4>), where n is 
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Fig. 10 (a) Computed total energies for helium, beryllium, neon, magnesium, and argon. Ratio of computed and 
accurate 11641 values is shown, (b) Computed total energies for lithium and sodium. Ratio of computed and accurate 
1 164 1 values is shown. Reprinted with permission from T. Hoynalanmaa et al., Phys. Rev. E 70, 066701 (2005). 
Copyright (2005) by the American Physical Society. 



the principal quantum number. The radial Schrodinger equation for a hydrogen-like atom is 

1 d2 Z £(^+1)" 



2dr2 r 2r2 



Pni{r)^eniPr.i(r), (40) 



where Z is the atomic number, Sni are the orbital energies, and Pnt{r) — ri?„f (r) are the radial wavefunc- 
tions multiplied by r. Functions Pne{r) are called wavefunctions, too. In general, the radial wavefunction 
Pni has 71 — ^—1 nodes 1 165 1. 

The Schrodinger equation H'ii = E'i' of a many-electron atom leads to Hartree-Fock equations in 
the central field approximation and with Slater determinant wavefunctions I166lll65lll67l . The Hartree- 
Fock equations for many-electron atoms are presented e.g. in our previous article ['1561. We omit the 
nondiagonal Lagrange multipliers and HF equations can be written as matrix eigenvalue equations L168J 



Fid = SiCi. (41) 

The basis sets used for the computations were formed so that there are S type basis functions 
'^'l"2B J ■ • ■ J resolution level fcmin and and D type basis functions . . . , V's-i for 

k = fcmin, • ■ ■ , fcmin + L — 2. Here L is the number of resolution levels and i? is a parameter describ- 
ing the number of basis functions in each resolution level. 

Relative error of the hydrogen Is orbital eigenvalue is plotted in figure |9] (a). Here the computation 
parameters L and B are varied. In figure|9l(b) the energy eigenvalues for several excited states of hydrogen 
are plotted. Here the number of resolution levels L is varied. Results for HF computations for closed-shell 
atoms are presented in figureQo|(a) and for open-shell atoms in figure[Tol(b). 

To summarize, we have demonstrated that interpolating wavelets can be successfully used to solve the 
atomic orbitals and the electronic structure of atoms. We were able to systematically increase the accuracy 
of the calculations by choosing the number of resolution levels and the number of basis functions at each 
level. This kind of flexibility is an important benefit of wavelets. The numerical results were found to 
converge to the exact ones as the number of resolution levels increases. However, with a large number 
of resolution levels the needed computation capacity increases considerably. A noticeable feature in our 
development for the Hartree-Fock formalism is that all relevent operators can be evaluated analytically. 
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6 Future developments 

In the following we list of our future plans with regard to the further development of software based on 
each of the three discretization methods. This is done in order to shed light on the motivation behind the 
work reported in Sec. 5 and to stimulate discussions and collaborations. 

6.1 Finite differences 

The emphasis on the MIKA-project will be in the development of the finite difference method within the 
GridPaw-code. The main scientific objective in the near future is the implementation of time-dependent 
density functional theory (TDDFT) both in the linear response limit and in the real-time domain. These 
formulations can be used, for example, for the calculation of optical absorption spectra and band gaps of 
semiconductors and insulators. The real-time formulation is needed for non-linear phenomena such as the 
electron dynamics under strong femtosecond laser pulses. Recently, a useful survey of time-propagation 
methods has been publihed by Castro et al. |66 1. 

The time dependent DPT problems are often solved using operator splitting methods. This means that 
the Laplacian part is transfeiTed to Fourier side with FFT and propagated there, while the potential is 
exponentiated directly in real space. Mostly second or fourth order splitting schemes are used. This is in 
fact completely analogous to the diffusion algorithm of Sec. 13.2.41 which can be seen as propagation in 
imaginary time. 

Recently, the so called exponential integrators have turned out to be very effective, especially in paraboUc 
problems [1691. These integrators use the exponential of the linear approximation. The development to 
hyperbolic systems, as TDDFT, is under active study. An interesting option for equation x' — f{x) is, for 
example, the following exponential explicit midpoint rule 1 170 1 

x'^+i = x'' + e^"{x''-^ ^x^) + 2h <P{hH) fix'') , (42) 

where H is an approximation of f'{x'') and (t){z) = (e^ — 1) /z. This produces a time reversible propagator, 
with good norm and energy preservation properties. Computations of e'^^v will be done using Krylov 
subspace techniques. Efficient preconditioning for these, is under active study. Also variants of the use of 
Magnus expansions for the exponential seem to be promising 1 171 1. 

The search for new, better exchange correlation functionals is an ever-growing quest within DFT. One 
problem in the standard LDA and GGA functionals is that part of the self-interaction contribution of the 
Hartree- and Coulomb-potentials remain in the total energy functional. One result of this remaining self- 
interaction contribution is that in finite systems the potential decays exponentially instead of the correct 
1/r behaviour In contrast to the LDA, the Hartree-Fock scheme is self-interaction free and has the correct 
asymptotic behaviour One approach to improve from the LDA is the so-called optimized potential method 
(OPM) where a multiplicative potential is constructed from a Hartree-Fock-like exchange energy functional 
1 172 173 1. This potential, which depends explicitly on the Kohn-Sham orbitals, has to solved from a 
complicated integral equation. We are going to implement this exact-exchange scheme by using the KLI- 
approximation 1 174| for the OPM integral equation. 

6.2 Finite element methods 

Even though finite differences methods are currently the most well established in the context of the elec- 
tronic structure calculations, it is clear that FE-methods could provide certain advantages. Thus, further 
assessment of the feasibility of FE methods in electronic structure calculations is one of the objectives of 
the current MIKA-project 

In this section we briefly summarize what we already have and what is still missing. All-electron 
ground-state calculations for molecules can be performed today with Elmer, as described in Sec. 15.41 With 
the commercial mesh generator GiD significant reduction in the number of second order basis functions for 
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a high quahty computation of the carbon monoxide molecule was observed (Sec. l5.4t in comparison with 
the 2002 version of Netgen. Interestingly, both of these discretizations yielded smaller matrix problems 
than a state-of-the-art finite-difference pseudopotential approach (see Sec. l7.2> . and also computations of 
larger molecules (Cgo) were shown to be feasible. Periodic boundary conditions have been implemented. 
Further improvements of efficiency utilization of /ip-elements, pseudopotentials (or PAW) and fine-tuning 
of the eigenproblem solver and mixing scheme. On the other hand, if we aim at an efficient method for 
real-time TDDFT calculations, the performance (in terms of cpu-time) of the eigenproblem solver and 
mixing scheme becomes less critical, whereas the reduction of the size of the Hamiltonian matrix becomes 
even more important. 

An alternative route to a general purpose FE-package may be provided by the code described in Sec. 15. 51 
This program already includes an implementation of norm-conserving pseudopotentials. This implemen- 
tation can possibly be adapted to Elmer, or an eigenproblem solver can be added into the transport code. 

The question of the feasibity of molecular dynamics with an unstructured nonuniform FE-mesh is often 
raised. The FE-community provides the following suggestion, based on experience in other fields: each 
atom is attached to the mesh, and the mesh is considered to be made of "rubber". As the atom moves, the 
mesh moves along, but only within a finite range, and towards the boundary of the range the mesh moves 
less. Therefore, the basis functions are indeed functions of atomic coordinates and Pulay forces occur. If 
the atoms move over large distances, they may "tear the mesh apart" locally, by making certain corners of 
tetrahedra too sharp. In this case local remeshing is applied. In this context, it should be also pointed out 
that successful molecular dynamics simulations using structured nonuniform finite-element meshes (in so 
called adaptive coordinates) have been performed already by Tsuchida 1 20 1 . As a final remark we add, that 
the possibility to use nonuniform unstructured meshes within FEM is a very useful, but optional feature. It 
is also possible to work with uniform meshes, and indeed implementation of molecular dynamics is easier 
in that case. Even a uniform high-order FE-mesh is expected to be more efficient than a uniform FD-grid 
- at least in the simplest case where no interpolation tricks are applied on the FD-side - in terms of the 
required number of basis functions (or grid points) for a given accuracy. 

6.3 Wavelet package 

The wavelet based approximation scheme is still quite recent computational tool in electronic structure 
calculations even though it has shown to be very successful in other applications in science and engineering. 
Its use so far has been mostly restricted to cases which reduce to one-dimensional problems and whose 
solutions by other methods are well known in the literature. However, a promising recent development 
should be noticed in this context 1 1751. 

In near future we will continue the investigation to which extent the wavelet method can speed up the 
computation of the atomic orbitals. Especially we will study two and three-dimensional problems. We are 
currently investigating the use of interpolating tensor product wavelets for these calculations. Special kind 
of nonseparable wavelets may be better for this purpose. They allow achieving the full benefit of the fast 
evaluation of the discrete wavelet transform. Also the methods will be compared with the more traditional 
solution methods. 

7 Discussion 

In this section we compare the different methods of discretization from various points of view. The discus- 
sion is biased towards comparing the EE and FD methods to each other, the wavelet approach is not equally 
well represented here due to the lack of experience on wavelets of the authors of the present section. 

InSec.O we discuss the similarities and differences of the discretization methods, emphasizing aspects 
of local refinements, ease of implementation and linear algebra. In Sec. 17.21 a quantitative comparison 
between the sizes of the resulting matrix problems in a pseudopotential FD method and an all-electron 
EE-calculation for CO is made. Also some speculation based on FE-results from a transport calculation 
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with pseudopotentials is presented. In Sec. l7.3l we present an awkward transition (in latin: pons asini) from 
the double grid method of Ono and Hirose 1161 to the state-of-the art finite-element method. 

7.1 Similarities and differences between the methods of discretization 

The three methods for the discretization of the eigenvalue equation considered in this paper have significant 
similarities but also some differences. The most notable difference is that in the finite-element and the 
wavelet method a basis set is present whereas the finite-difference method relies on representing functions 
by their values at grid points in space. This difference has two main consequences. 

1. The use of a basis set gives more freedom for constructions with varying spatial resolution. It is also 
easy to take into account details in geometry and general forms of boundary conditions. On the other 
hand, one must be able to generate non-uniform meshes conforming to details in geometry in three 
dimensions. 

2. The basis set nature of the finite-element and the wavelet methods allows the use of variational argu- 
ments in the development of the method. This decreases the regularity requirements imposed on the 
solution and leads to a simpler error analysis. 

Let us discuss the question of varying spatial resolution a bit more closely. In the finite-difference 
context the main approach is the method of adaptive coordinates EH IT76J This method involves a 
mapping from a regular grid in three-dimensional parameter-space to a curvilinear coordinate system in 
real-space. This approach obviously has some limitations with regard to the ratio of smallest and largest 
local grid-spacing, and at least cannot be regarded as a promising approach for all-electron calculations 
of heavy elements. Special care has to be exercised in order to obtain symmetric finite-difference matrix 
representation of the Laplacian in adaptive coordinates ll79l ISj. Nonsymmetric matrices would render 
standard iterative methods from linear algebra, such as the conjugate gradient method, or Lanczos method 
for matrix exponentiation required in time-propagation schemes 1661 unreliable. On the finite-element 
side, an implementation of adaptive coordinates has been introduced as well |6|, but the pure FE-approach 
involves instead general unstructured meshes consisting of elements of various, often tetrahedral, shapes 
(see Sec. 12. 3> . This approach allows greater spatial variations in the element size, granting accurate all- 
electron calculations with problem sizes much smaller than the coiTesponding FD-grids in pseudopotential 
calculations (see Sec. The resulting matrices (H, S, A) in finite-element methods are symmetric by 
construction. For the sake of completeness, we mention that another approach, namely that of composite 
grids 117711178 1, has been introduced in the context of finite-differences for varying spatial resolution. In 
this method, local patches of finer grids are applied. In the wavelet approach, local refinements are applied 
in a similar way 11591 . 

From an algorithmical point of view the finite-difference psedopotential method with uniform grid and 
trapetsoid rule (Eq. |6j seems to be the simplest one from the first outlook. The resulting matrices are 
well-structured with a regular hierarchy. On the other hand, in the basis set methods each element of the 
mesh can be processed using the same algorithm as long as all the elements are affine images of some basic 
reference element. This is especially easy in the wavelet method where the basis functions are obtained 
via a simple scaling methodology. When implementing a finite-element code, it is possible and important 
to reuse large parts of the general purpose finite-element codes which are freely available to the general 
public, such as those of Refs. li,15..14J . When basic routines are recycled, the implementation of an FE- 
approach becomes very simple as well. 

From the viewpoint of linear algebra all the methods have rather similar characteristics'^ They all result 
in sparse matrices whose condition number is governed by the underlying differential operator. They all 

Note however, that whereas the FE-approach leads to symmetric matrices for the discretized Hamihonian H, Laplacian A and 
overlap matrix S, the FD-method tends to make H and/or A nonsymmetric in the case where adaptive coordinates or generalized 
FD-schemes of l29ll?0lBl'l are applied, respectively. 
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allow the use of multilevel and domain decomposition methods with effective preconditioners. The major 
differences in favor of FD appear only for very large systems with several hundred million degrees of 
freedom when the well-structured nature of the finite-difference grid makes it possible to apply the matrix 
on a vector in an iterative scheme without explicitly storing it at any stage of the computation. In our FE- 
examples presented below, where second and fourth order tetrahedral meshes were considered, the storage 
of the necessary matrices (not counting pseudotentials as dense blocks) required memory equivalent to the 
storage of the order of one hundred eigenfunctions. This remains the case for larger systems as well, thus 
the relative extra storage becomes increasingly insignificant as larger systems are addressed, in the typical 
applications where a set of all occupied (and possibly some unoccupied) states need to be calculated. 

7.2 Quantitative comparisons 

It is difficult to make objective comparisons between two different discretization methods based on two 
different implementations as computer programs. There are very many parameters that affect the efficiency 
of the programs, and in the end, the user is only interested in the cpu-time required for a calculation with 
given accuracy. An objective comparison of the efficiencies of the discretization step, however, has to focus 
on the size of the matrix problem^" produced, rather than on the cpu-time spent by the entire calculation 
(unless exactly the same implementation is used for the solution of the resulting matrix problem). The 
cpu-time spent at the solution step then depends, apart from the size of the matrix problem, also on the 
implementation details and on the algorithms chosen for the eigenproblem, linear systems of equations, 
and mixing. 

The dipole-moment of the carbon monoxide molecule^' is a well known basis-set sensitive quantity. 
Even with the high-order FD-method and the straightforward trapetsoid rule of Eq. |6l a rather small grid 
spacing of 0.2 a.u. is required for a converged result (with a tolerance of 0.01Z3) when norm conserving 
Trollier-Martins (TM) psedopotentials are used^^ 11791 . When the computational volume is a sphere of 
radius 10 a.u., this results in 520 000 grid points. A smaller sphere results in loss of accuracy for the value 
of the dipole moment. Interestingly, the number of degrees of freedom in this calculation is greater than the 
number of the 39 000 quadratic basis functions in the all-electron FE-calculation within a sphere of radius 
20 a.u., reported in Sec. 15.41 There is no essential difference in the average number of nonzero matrix 
elements per row in the two calculations. This number was 42 in the FD case"'and 55 in the EE case. 

In Sec 15.51 and Ref. 1 140| a thin Hf02 layer was modelled with Green's functions, TM pseudopoten- 
tials and fourth order p-elements. The unit cell, periodic in two dimensions, consisted of 85 atoms, with 
dimensions of 27 x 21 x 21aff. 28 000 basis functions were considered sufficient for converged results^^. 
This was a transport calculation where Green's functions were evaluated, and we have no experience on 
the equivalent calculations using FD-methods. However, extrapolating from experience with Kohn-Sham- 
equations, and also in light of the previous example it is plausible to conclude that the grid spacing of 0.75 
a.u., that would result in 28 000 grid points in this case, can hardly give reliable results. 



The size of a sparse N X N matrix A naturally involves, besides the number of rows A'^, also the average number M of 
nonzero elements per row in A. When A involves pseudopotential operators, it becomes also relevant whether these are stored as 
dense blocks, as is necessary in the context of direct methods (Sec. 13.1.21 or if only the projector vectors are stored, as is common in 
FD-methods, and the pseudopotential part of the Hamiltonian is evaluated in a "matrix free" manner 
^' Within the local density approximation, its basis-set Umit is — 0.226D 1 136|. 

The required grid-spacing is dictated by the specific choice of the pseudopotential, in this example = r-p = l.Sa.u. was 
used for both C and O. Admittedly, it may be possible to find a pseudopotential which allows a slightly larger grid spacing for a 
converged result. Especially so, if the KB-pseudopotential is replaced by the PAW formalism. 

37 matrix elements per row for the kinetic energy operator with N=6 in Eq.|5]plus two additional dense blocks for the pseu- 
dopotentials. The dense blocks were not stored in the actual implementation. 

The matrix for the Laplacian had 75 nonzero elements per row. In the Hamiltonian matrix, the pseudopotentials were stored 
as dense blocks as necessitated by the direct method of Sec. 13.1.21 and thus 893 matrix elements per row were needed. In other 
applications, it may be more efficient to just store the projector functions for the pseudopotentials, which is the standard method also 
in FD-appHcations. 
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7.3 Generalizing the double grid method 

At the end of Sec. l2.1l we recognized, that the double-grid method of Ono and Hirose 1161 in fact amounts 
to computing the matrix elements of the pseudopotential operator in a special basis set arising from the 
polynomial interpolation used. We repeat here the formula for the matrix elements of the pseudopotential 
operator 

^ijM'j'fe' ^^^ciXHjkXH'j'k' ^ / 0^jfc(r)y"V^'J'fc'(r)dr, (43) 

where the required integrals are approximated by a trapetsoid rule on a fine uniform grid. For the sake of 
consistency, it is tempting to ask if not the same idea could be appUed to the entire hamiltonian H, i.e., 
replace it with a discrete version 

Hijk,i'j'k= J (f>ijk{r)H(l)i'j'k'{r)dr. (44) 

Indeed, this can be done, but then we must keep in mind that the basis set of (j)ijk'& is not orthonormal, but 
their overlap matrix 

Sijk,i'j'k= J (f>ijk{r)(t>i'j'k'{r)dr. (45) 

also enters the formalism^^, and we end up with the matrix equation He — eSc for the unknown coefficients 
in the expansion 

V'(r) = ^Cijk(t)ijk{r). (46) 

ijk 

The procedure outlined above can be recognized as a variational method of Sec. l2.2l with the basis set of 
4>ijk^ spanning both the trial and test spaces. Thus the double grid method itself is somewhere between the 



straightforward FD-scheme of Eqs \5\6\ and the finite-element method. Some similarities with the present 
considerations can be seen in the approach based on the "Galerkin transfer", outlined in Refs. 1641 1651 . 
where a discretization on a grid was obtained from an FD-discretization on an auxiliary finer grid, taking 
advantage of the prolongator and restrictor operarors from the multigrid formalism. The present approach 
can be seen as a limiting case, where the fine grid is replaced by continuous space. 

We add two more comments, i) When numerically evaluating integrals such as Eqs. (|8] |^ I45> the 
trapetsoid rule with uniform sampling is not the most efficient method"^. The integrands have support 
only within an orthorhombic volume, and very efficient numerical cubature formulas within such standard 
volumes are well known 1 182|. ii) Although the thought experiment above gave a continuous path from 
the double-grid method to the FE-method with a basis consisting of the functions 4)ijk, we do not suggest 
extending this basis set to very high order polynomials as it may result in matrices with high condition 
number. 

Other bases of low order polynomial product form are e.g. the "blip"-function basis (in the special case 
of fc = 0) of Hernandez and Gillan 1111 as well as the polynomial product basis of Tsuchida and Tsukada 
11831 . The nonseparable cubic "serendipity" element fjT] gives rise to a useful basis set which is no longer 
of product form, but still is associated with a simple uniform grid structure. To complete our awkward 



This reasoning also reveals, that the original implementation of the double grid method, where no S matrix appears, may be 
improvable also if an orthogonal set of 4>ijk^ can be found. 

The question of numerical integration always appears when a basis set is introduced in DFT calculations. For example, within 
the package Amsterdam Density Functional (ADF), Slater-type basis funcions are used. The numerical integration routines 1 180 18lJ 
play a central role in making ADF one of the most accurate codes in the market. 
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transition from the double grid method to the state-of the art FE-methods I34ll35ll36l . we repeat here that 
the tetrahedral p-element basis [39' "1401 discussed in Sec. 12.3. f1 associated with a good mesh generator 
such as the rapidly developing open source Netgen-package I.14J grants us, in principle, with complete 
flexibility in locally selecting the polynomial order p as well as the element size h. 

8 Summary 

In this paper we have discussed our efforts to develop systematic real-space methods as alternatives to the 
standard plane-wave and atom-centered basis function methods. We aim at a general purpose infrastructure 
for electronic structure calculations based on the Kohn-Sham density-functional theory 1 2 , 1 1 and its time- 
dependent generalization 1 3 1 . Our first line of work is the MIKA-project. In its early stages, this project has 
been rather tightly bound to the finite-difference method, the Rayleigh-quotient multigrid method, and the 
MIKA program package. In the near future the main development effort is directed to the implementation 
of software tools for time-dependent density-functional theory within the GridPaw-code 1971 . It is based 
on a different implementation of the FD method. Its attractive feature is a successful FD-implementation 
of the Projector Augmented Wave method, originally developed in the plane-wave context, which allows 
the use of relatively coarse uniform FD-grids for the accurate description of a wide range of elements. 

The second theme of this paper has been the promotion of the applications of FE methods within elec- 
tronic structure calculations. This work extends the scope of the MIKA project. Nowadays, there are 
general-purpose open-source FE-packages available 1141 1151 . that are extremely well suited as starting 
points for the development of solvers for the Kohn-Sham equations. Thus the argument that implemen- 
tation of FE-methods for electronic structure problems is more difficult than that of the FD-methods has 
to be reconsidered. We have compared a very preliminary all-electron example calculation for the car- 
bon monoxide molecule^^ with a state-of-the-art pseudopotential FD-calculation, in terms of the size of 
the matrix problem and seen that the spatially adaptive FE-discretization yields a matrix problem which 
is more than an order of magnitude smaller Further reductions in the size of the problem are expected 
when pseudopotentials or the PAW method is implemented, and/or full advantage is taken of hp-FEM 
methods, where spatial variations in the order p of the piecewise polynomial basis functions are allowed 
simultaneously with variations in the element size h. 

Reductions in the CPU-time required by the finite-element ground-state computations may also be avail- 
able via a careful selection of the eigenproblem solver and mixing scheme. However, as the ultimate goal 
is to utilize also the FE-approach in real-time propagation within TDDFT, the optimal performance of the 
ground state solver is of secondary importance (as long as it delivers highly accurate results - this may 
be critical for the stability of the time propagation - and as it is reliable in the sense that not too much 
labour of the users needs to be invested in obtaining a convergent ground state calculation) since the time 
propagation will always consume the dominant part of the cpu-time. The reduction of the size of the 
Hamiltonian matrix is directly relevant also for the performance of time propagation. From a somewhat 
more philosophical point of view, we have contemplated on the now popular double grid method due to 
Ono and Hirose 1 16 1, and seen it as the first step in a gradual transformation from the FD-method to the 
FE-method. 

The third line of work discussed in this paper is the development of a program package based on the 
wavelet approach. This effort is still at a very preliminary stage (thus far only spherically symmetric, 
computationally one-dimensional results have been reported), and it has been included here mainly with 
the dual aim of increasing the level of wavelet awareness among the authors of the present paper and of 
stimulating discussions with the various wavelet specialists. Based on the current results, it is obvious that 
our wavelet package has not yet reached a level of maturity comparable to that of the FD and FE packages. 
Thus we will base our TDDFT developments on either one, or both, of these two methods. 



" We have also performed an all-electron calculation with multimesh preconditioning for the Ceo molecule, demonstrating that 
our scheme is not limited to small systems. 
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